RESULTS OF THE TSUNAMI FIELD TRIALS: 

POSITION LOCATION IN MACRO AND MICRO CELL ENVIRONMENTS 
Introduction 

The CEC ACTS TSUNAMI (II) project involved the development, integration and field trial 
evaluation of a DCS 1800 second-generation mobile base station with an adaptive antenna. The 
consortium (see footnote 2) designed and built an 8-element transmit and receive adaptive antenna 
system, employing digital beamforming techniques operating over a DCS 1800 air interface, and 
assessed its performance for a variety of operational scenarios using the Orange testbed 
facility in Bristol. The TSUNAMI (II) field trial system included both transmit and receive 
calibration facilities in order to obtain adequate accuracy from the digital beamforming techniques. 
The ability to track users is a fundamental feature of this technology. 

The application of array signal processing to provide mobile location information has been assessed 
using the data collected during the trials, and the results compared with position information 
obtained via a commercially available GPS receiver. This application is particularly relevant to 
'hot-spot' detection in cellular networks as well as the '911' location requirement in the US. 

Mobile location 

The problem of accurate position estimation of cellular subscribers is receiving growing attention 
for a number of reasons. In the United States regulations are to be introduced which will require 
wireless operators to locate accurately subscribers making emergency calls ('911'). Network 
operators could also use subscriber location information for cellular planning (i.e. traffic 'hot spot' 
detection), location sensitive billing and also for offering new services. This type of facility can be 
provided through the use of array signal processing and can be regarded as a 'value added service ' if 
the array is also providing capacity enhancements to the network. 

A number of technical solutions have been proposed to this problem. Most of these techniques rely 
on measuring the time of arrival (TOA) of the signal from the mobile station at three or more 
receiving points, from which the location of the mobile station can be calculated. Other techniques 
use direction of arrival (DOA) measurements from at least two receiving locations. Here an 
alternative technique is proposed, which uses an antenna array at a single receiving point to 
estimate mobile transmitter location by combining TOA and DOA information. This technique 
has the advantage that, in some applications (particularly traffic hot spot detection), only one active 
receiving site is required, and furthermore this site can operate independently of the rest of the 
communications network. 

This paper reports on work carried out by ERA Technology, in which mobile location algorithms 
were applied to the TSUNAMI (II) field trial data. The work was conducted as an adjunct to the 
TSUNAMI (II) work packages. 



1 The authors are with ERA Technology Ltd. 

2 Consortium comprising: ERA Technology (coordinator), Motorola ECID, Orange PSC, University of Bristol, Wireless 
Systems International, CASA, Robert Bosch, University of Aalborg, France Telecom CNET, University Polytechnic of 
Catalunya. 
Algorithm Description 

In the proposed algorithm, the processing for source location consists of three key stages. Initially, 
the channel impulse response for each element of the antenna array is estimated from the signal 
vector received at the base station. Then, estimates of the direction of arrival (DOA) and ranges are 
extracted. Finally, these raw estimates are filtered to obtain the estimates of mobile trajectory. This 
approach has a number of advantages, in particular: 

• The method is simple and fast, since it is based on channel impulse response estimates and is 
therefore suitable for real-time applications. 

• The influence of interference from other cells is limited, since channel impulse responses are 
discriminated by means of the training sequence de-correlation properties. 

• Post-processing is applied to estimates to increase position accuracy. 

Each GSM burst contains a 26-bit training sequence, shared by all mobiles of a given cell. 
Interference from other cells is limited by using orthogonal training sequences. When modulated, 
the training sequence produces a 16-bit sequence, periodically extended by 5 bits on either side. 
When MSK modulated, the 16-bit sequence has an ideal (Dirac) auto-correlation function. GMSK 
modulation broadens the main lobe and introduces side lobes. 

Channel impulse responses, H, are estimated and stored at the basestation equipped with the 
appropriate array and signal processing hardware. This is simply achieved by cross-correlating the 
modulated training sequence, corresponding to the relevant mobile station, with the received burst. 
When channel distortions have been introduced by multipath, H is a weighted sum of Dirac 
functions. In order to improve the time resolution of the paths, the impulse responses can be 
interpolated. Finally, H is pre-multiplied by the beamformer weight coefficients matrix, W, in order 
to obtain the spatial impulse response for the antenna array, He. The squared magnitudes of the 
entries in He represent the received energy from the burst, as a function of time of arrival and 
direction of arrival. The dominant multipath components are estimated from the highest values of 
He. DOA estimates are given by the row index of the peaks, and time of arrival estimates by the 
corresponding column index. The range estimates are found by multiplying the time of arrival 
estimates by the speed of light. In the following analysis, only one peak is selected from each 
spatial impulse response estimate obtained from the measured data. 
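
The three steps just described (cross-correlation, beamforming, peak selection) can be sketched 
as follows. This is a minimal illustration under stated assumptions, not the trial software: a 
13-chip Barker code stands in for the modulated GSM training sequence, the rows of W are taken to 
be conventional steering vectors for a uniform linear array with half-wavelength spacing, and all 
function names are hypothetical. 

```python
import cmath
import math

# Assumption: a Barker code stands in for the modulated training sequence.
BARKER_13 = [1.0, 1.0, 1.0, 1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, 1.0, -1.0, 1.0]


def steering_vector(angle_deg, n_elements, spacing_wavelengths=0.5):
    """Array response of a uniform linear array (assumed geometry)."""
    phase = 2.0 * math.pi * spacing_wavelengths * math.sin(math.radians(angle_deg))
    return [cmath.exp(1j * phase * m) for m in range(n_elements)]


def estimate_impulse_response(received, training, n_delays):
    """Cross-correlate the known training sequence with one element's burst."""
    n = len(training)
    return [sum(training[k].conjugate() * received[k + d] for k in range(n)) / n
            for d in range(n_delays)]


def spatial_power(H, angles_deg):
    """Squared magnitude of W*H: rows index DOA, columns index delay."""
    n_elements, n_delays = len(H), len(H[0])
    power = []
    for angle in angles_deg:
        w = steering_vector(angle, n_elements)
        power.append([
            abs(sum(w[m].conjugate() * H[m][d] for m in range(n_elements)) / n_elements) ** 2
            for d in range(n_delays)
        ])
    return power


def strongest_path(power, angles_deg):
    """Return (DOA in degrees, delay index) of the single largest entry."""
    _, row, col = max((p, i, j) for i, r in enumerate(power) for j, p in enumerate(r))
    return angles_deg[row], col


def synthetic_bursts(angle_deg, delay, n_elements=8, length=24):
    """Noise-free single-path test signal for each array element."""
    a = steering_vector(angle_deg, n_elements)
    bursts = []
    for m in range(n_elements):
        burst = [0j] * length
        for k, chip in enumerate(BARKER_13):
            burst[delay + k] = a[m] * chip
        bursts.append(burst)
    return bursts


angles = list(range(-60, 61))  # one-degree steps, as in the text
H = [estimate_impulse_response(b, BARKER_13, 8) for b in synthetic_bursts(20, 3)]
doa, delay_index = strongest_path(spatial_power(H, angles), angles)
```

With a single noise-free path the peak falls on the simulated angle and delay; multiplying the 
delay (in seconds) by the speed of light would then give the range estimate, as in the text. 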

The discrete nature of the matrix W limits the DOA accuracy to one-degree steps, covering -60° 
to +60° with respect to the array boresight. Range accuracy is limited by the number of samples per 
bit used in the interpolated channel impulse response estimation, and here 32 samples per bit have 
been used. The range resolution is approximately 17m, since the sidelobes of the autocorrelation 
function of the training sequence tend to bias these estimates. 
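
As a rough check on these figures, the size of one interpolated delay sample can be converted to 
metres. The sketch below assumes the standard GSM/DCS 1800 bit period of 48/13 microseconds; 
reading the 17m figure as half of the path-length step (a two-way delay) is an interpretation, 
not something the text states. 

```python
C = 299_792_458.0        # speed of light in m/s
BIT_PERIOD = 48e-6 / 13  # GSM/DCS 1800 bit period in seconds (about 3.69 microseconds)


def delay_step_metres(samples_per_bit):
    """Path length corresponding to one interpolated delay sample."""
    return C * BIT_PERIOD / samples_per_bit


path_step = delay_step_metres(32)  # about 34.6 m of path length per sample
range_step = path_step / 2         # about 17.3 m if the measured delay is two-way
```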

Post-processing was employed to reduce quantization noise in the location estimates. Here, DOA 
and range estimates are made at regular intervals. Then, histograms of the DOA and range estimates 
are computed. These histograms are almost always mono-modal, except in the presence of 
interference, which introduces a smaller second peak in the histogram. The values corresponding to 
the histogram maxima are used as representative estimates for the interval only if their relative 
frequencies are greater than a predefined threshold. Otherwise, representative estimates are obtained 
from the linear extrapolations of the previous values. 
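
The post-processing rule can be sketched as below. This is an illustration only: the threshold 
value, the use of the two most recent representative values for the extrapolation, and the 
function name are assumptions, since the text does not give these details. 

```python
from collections import Counter


def representative_estimates(intervals, threshold=0.5):
    """intervals: one list of quantized raw estimates (DOA or range) per interval."""
    output = []
    for raw in intervals:
        value, count = Counter(raw).most_common(1)[0] if raw else (None, 0)
        previous = [v for v in output if v is not None]
        if raw and count / len(raw) > threshold:
            # Histogram maximum is dominant enough: accept it.
            output.append(float(value))
        elif len(previous) >= 2:
            # Linear extrapolation from the two most recent representative values.
            output.append(2.0 * previous[-1] - previous[-2])
        elif previous:
            output.append(previous[-1])  # not enough history: hold the last value
        else:
            output.append(None)          # nothing to extrapolate from yet
    return output
```

For example, with intervals `[[10, 10, 10, 11], [12, 12, 12, 12], [14, 3, 7, 20], [16, 16, 16, 15]]` 
the third interval has no dominant histogram bin, so its value is extrapolated from the first two. 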
Macrocell Results 

The algorithms have been tested for one stationary test, with line-of-sight to the antenna mast, and 
seventeen moving tests. 

Results are reported in Table 1, where angles are given in degrees, distances in metres and error 
probabilities in percentage. Figure 1 represents the true and estimated trajectories for the test 045. 
Table 1: Results for the macrocell trials 




Figure 1 : GPS (solid line) and estimated (dashed line) positions for the test 045. 
The base station is at the origin of the figure. 
Macro-cell location performance 

In Table 1, location performance has been evaluated in terms of the mean and standard deviations of 
the DOA and range errors, as well as the root mean square position error (RMSE), for both estimated 
and GPS derived position location. Furthermore, the accuracy of the location estimates is assessed 
by counting the number of position estimates within increasing radii circles (50m, 125m, and 250m) 
centred on the true positions. Since differential GPS equipment was not available during the tests, 
the GPS measurements are subject to a bias that varies between experiments. When the exact routes 
are known, the measured GPS positions are overlaid onto an Ordnance Survey map and the GPS 
trajectories are shifted to be as close as possible to the exact routes. Thus, the results include the 
influence of any short-term GPS positioning errors. 
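
The position metrics described above (RMSE and the percentage of estimates falling within 
circles of increasing radius) can be computed along the following lines. This is a sketch with 
a hypothetical function name; positions are assumed to be (x, y) pairs in metres relative to the 
base station. 

```python
import math


def location_metrics(estimated, truth, radii=(50.0, 125.0, 250.0)):
    """Return (RMSE in metres, {radius: percentage of estimates within it})."""
    errors = [math.hypot(ex - tx, ey - ty)
              for (ex, ey), (tx, ty) in zip(estimated, truth)]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    within = {r: 100.0 * sum(e <= r for e in errors) / len(errors) for r in radii}
    return rmse, within
```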

Macro-cell position accuracy 

The following parameters were found to be particularly sensitive in terms of the resultant position 
accuracy: 

• The number of samples per bit and the number of angles used in the channel impulse response 
estimation give an uncertainty on the estimates of 35m and 1°. Filtering was introduced to decrease 
this effect. 

• Drift in frequency in the mobile station puts a lower limit on the range variance of 
approximately 137 metres. 

• Drift in frequency has no impact on the DOA estimates, but angle errors dominate in the macro 
tests: a variation of 1° at 5km corresponds to 75m. 

• Errors in location system calibration add bias terms that are difficult to estimate. 

Some problems have been identified for some of the test runs. 

• At the end of two test runs the interference was selected instead of the signal. 

• For three tests the estimated trajectory moves away from the true position, probably because of 
the extrapolation method used between two reliable points. 

• At the ends of some test runs the angles are greater than +60°. The mobiles are then out of range 
and the DOA estimates correspond to side lobes of the channel impulse responses. 

For the remaining nine tests, the average RMS error is 136m. On average, the estimated positions 
are within 50m of the true positions one third of the time, within 125m three quarters of the time and 
96% of the estimates are within 250m of the true positions. 

Microcell Results 

Here the algorithm was applied to six stationary tests as well as five moving ones. 

For the static tests the general results are given in Table 2 where angles are given in degrees, 
distances in metres and error probabilities in percentage. Figure 2 represents the exact positions and 
the mean positions of the estimates on the map. 
Table 2: Results for the microcell, stationary tests. 



For test 204, range estimates are always around the same average value, but the DOA estimates 
take values around -21 degrees or -45 degrees, probably corresponding to two different 
reflections. 

The frequency/time drift inside the mobile terminal is particularly visible and puts a limit on the 
range accuracy. This time drift implies shifts of one fourth of a bit period for the travel from the 
base station to the mobile and back to the base station, i.e. a corresponding range shift of 137m 
approximately. 

Values for the standard deviations are generally similar to those obtained for the macro tests. 
Mean values are larger than those obtained with the macro tests. These large biases are mainly 
due to reflections. Exploiting geometric properties of reflections may reduce them. 
Reflections modify angles and ranges. 

Exact positions from reliable estimates or positions of the mobile as seen by the basestation can 
be determined from the building positions and heights. 

However, taking into account reflections increases the region of uncertainty of estimates. Using 
several reflections for the same test may reduce this uncertainty. 




Figure 2: True and estimated (underlined) positions for the stationary microcell tests 
The five moving tests all followed the same route (see Figure 4). Here, the results are very similar, 
showing that the experiment is repeatable. After 400 seconds, the mobile is out of range since its 
angle of arrival is less than -60 degrees. Only the first parts of the tests are thus considered. 
Figure 3 represents the DOA and range estimates for the test 277. In the histogram computation, all 
representative estimates were kept and the averaging filter has not been applied. 

DOA estimates present clear changes of slope. These may correspond to different reflections and 
could be resolved with a map indicating building positions and heights. Range estimate slopes also 
present discontinuities that are not always easily associated with the reflections identified from 
the angle estimates, and which may be caused by the time drift inside the mobile terminal. 

The lines in Figure 3 indicate the changes in the slopes of the DOA for the test 277. The same lines 
are drawn for the range estimates. The positions on the trajectory at the corresponding times are 
represented by large dots in Figure 4. Many of them are explained by the building locations, even 
without knowing their heights. With a three-dimensional map, more precise estimates of the mobile 
location may be achievable. 

Figure 5 indicates the position estimates derived for a few fixed locations in more detail. The 
influence of multipath propagation from surrounding buildings is clearly evident. 




Figure 3: True (solid line) and estimated (dotted line) DOA and ranges for the test 277 
Figure 4: Route for the moving micro tests. 
Points on the trajectory correspond to the lines in Figure 3 



Conclusions 

A technique for estimating the location of a mobile radio transmitter, using an antenna array at a 
single receiving point, has been presented. Such a technique could be used to aid cell planning in a 
cellular network by mapping traffic density within a coverage area, or as a solution to the US 911 
location requirement. 

The technique has been tested using the TSUNAMI (II) macrocell and microcell field trial data. The 
macrocell results show that the mobile can be located to within a circle of radius 125m for 
approximately 75% of the time. It is believed that a large component of the position error is due to 
the timing uncertainty within the mobile itself, which can be up to one quarter of a bit period. 

The microcell environment is clearly more challenging, because the perceived direction of arrival is 
often significantly different from the true bearing of the mobile station, resulting in biased estimates 
of DOA and range. The overall position accuracy in the microcell tests appeared to be better than 
the macrocell case. However, this is not a fair comparison since the position accuracy for the 
microcell was only evaluated for a limited number of stationary tests. In the moving microcell tests, 
sudden changes in the channel (the so-called 'corner effect') can be clearly seen in the DOA and 
range estimates and in some cases these changes can be clearly related to the positions of buildings 
within the test environment. Nonetheless, the microcell results demonstrate that the technique has 
promise but requires further refinement to ameliorate the influence of multipath. 
Data Gathering Algorithms in 
Sensor Networks Using Energy Metrics 

Stephanie Lindsey, Cauligi Raghavendra, Fellow Member, IEEE, and 
Krishna M. Sivalingam, Senior Member, IEEE 

Abstract— Sensor webs consisting of nodes with limited battery power and wireless communications are 
deployed to collect useful information from the field. Gathering sensed information in an energy 
efficient manner is critical to operating the sensor network for a long period of time. In [12], a 
data collection problem is defined where, in a round of communication, each sensor node has a packet 
to be sent to the distant base station. There is some fixed amount of energy cost in the electronics 
when transmitting or receiving a packet and a variable cost when transmitting a packet which depends 
on the distance of transmission. If each node transmits its sensed data directly to the base station, 
then it will deplete its power quickly. The LEACH protocol presented in [12] is an elegant solution 
where clusters are formed to fuse data before transmitting to the base station. By randomizing the 
cluster-heads chosen to transmit to the base station, LEACH achieves a factor of 8 improvement 
compared to direct transmissions, as measured in terms of when nodes die. An improved version of 
LEACH, called LEACH-C, is presented in [14], where the central base station performs the clustering 
to improve energy efficiency. In this paper, we present an improved scheme, called PEGASIS 
(Power-Efficient GAthering in Sensor Information Systems), which is a near-optimal chain-based 
protocol that minimizes energy. In PEGASIS, each node communicates only with a close neighbor and 
takes turns transmitting to the base station, thus reducing the amount of energy spent per round. 
Simulation results show that PEGASIS performs better than LEACH by about 100 to 200 percent when 
1 percent, 25 percent, 50 percent, and 100 percent of nodes die for different network sizes and 
topologies. For many applications, in addition to minimizing energy, it is also important to consider 
the delay incurred in gathering sensed data. We capture this with the energy x delay metric and 
present schemes that attempt to balance the energy and delay cost for data gathering from sensor 
networks. Since most of the delay factor is in the transmission time, we measure delay in terms of 
number of transmissions to accomplish a round of data gathering. Therefore, delay can be reduced by 
allowing simultaneous transmissions when possible in the network. With CDMA capable sensor nodes 
[11], simultaneous data transmissions are possible with little interference. In this paper, we 
present two new schemes to minimize energy x delay using CDMA and non-CDMA sensor nodes. If the goal 
is to minimize only the delay cost, then a binary combining scheme can be used to accomplish this 
task in about log N units of delay with parallel communications and incurring a slight increase in 
energy cost. With CDMA capable sensor nodes, a chain-based binary scheme performs best in terms of 
energy x delay. If the sensor nodes are not CDMA capable, then parallel communications are possible 
only among spatially separated nodes and a chain-based 3-level hierarchy scheme performs well. We 
compared the performance of direct, LEACH, and our schemes with respect to energy x delay using 
extensive simulations for different network sizes. Results show that our schemes perform 80 or more 
times better than the direct scheme and also outperform the LEACH protocol. 

Index Terms— Wireless sensor networks, data gathering protocols, energy-efficient operation, greedy 
algorithms, performance evaluation. 



1 Introduction 

Inexpensive sensors capable of significant computation and 
wireless communications are becoming available [4], [6], 
[8], [10], [16], [23]. A web of sensor nodes can be deployed 
to collect useful information from the field in a variety of 
scenarios including military surveillance, landmine detection, 
in harsh physical environments, for scientific investigations 
on other planets, etc. [1], [10], [16], [29]. These 
sensor nodes can self-organize to form a network and can 
communicate with each other using their wireless interfaces. 
Energy efficient self-organization and initialization 
protocols are developed in [18], [19]. Each node has 
transmit power control and an omni-directional antenna, 
and therefore can adjust the area of coverage with its 
wireless transmission. Typically, sensor nodes collect audio, 
seismic, and other types of data and collaborate to perform 
a high-level task in a sensor web. For example, a sensor 
network can be used for detecting the presence of potential 
threats in a military conflict. Since wireless communications 
consume significant amounts of battery power, sensor 
nodes should be energy efficient in transmitting data [3], 
[17], [25], [27]. Energy efficient communication in wireless 
networks is attracting increasing attention in the literature 
[5], [22], [24], [28], [30]. 

• S. Lindsey is with Microsoft Corporation, Redmond, WA 98052. 

• C. Raghavendra is with the Computer Systems Research Department, The Aerospace Corporation, 
PO Box 92957, Los Angeles, CA 90009-2957. E-mail: raghu@usc.edu 

• K.M. Sivalingam is with the School of Electrical Engineering and Computer 
Science, Washington State University, Pullman, WA 99164-2752. 

For information on obtaining reprints of this article, please send e-mail to: 
tpds@computer.org, and reference IEEECS Log Number 116182. 

Fig. 1. Random 100-node topology for a 50m x 50m network. The base 
station (BS) is assumed to be located at (25, 150), which is at least 
100m from the nearest node. 

A typical application in a sensor web is gathering of 
sensed data at a distant base station (BS) [12]. Fig. 1 shows a 
100-node sensor network in a playing field of size 
50m x 50m. There is an energy cost for transmitting or 
receiving a packet in the radio electronics and there is a 
variable energy cost depending on the distance in transmissions. 
Due to the r^2 or larger radio signal attenuation for a 
range r, it is important to limit transmission distances to 
conserve energy. 
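
The fixed-plus-distance-dependent cost structure can be illustrated with a small sketch. The 
constants below are assumptions chosen for illustration (they follow the first-order radio model 
commonly used in the LEACH literature); the paper's own radio model is the one discussed in its 
Section 2. 

```python
# Assumed illustrative constants, not taken from this section of the paper.
E_ELEC = 50e-9      # J/bit spent in the transmitter or receiver electronics
EPS_AMP = 100e-12   # J/bit/m^2 spent in the transmit amplifier (r^2 attenuation)


def tx_energy(bits, distance_m):
    """Energy to transmit `bits` over `distance_m`: fixed part plus r^2 part."""
    return E_ELEC * bits + EPS_AMP * bits * distance_m ** 2


def rx_energy(bits):
    """Energy to receive `bits`: fixed electronics cost only."""
    return E_ELEC * bits
```

Under these assumed values, sending a 2000-bit packet over 100m costs about 2.1 mJ, while 
sending it over 10m costs about 0.12 mJ, which is why short hops are attractive. 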

In this paper, we assume the following: 

• Each sensor node has power control and the ability 
to transmit data to any other sensor node or directly 
to the BS [20], [22]. 

• Our model sensor network contains homogeneous 
and energy constrained sensor nodes with initial 
uniform energy. 

• Every node has location information. 

• There is no mobility. 

1.1 Energy Reduction for Data Gathering in Sensor 
Networks 

In each round of this data-gathering application, all data 
from all nodes need to be collected and transmitted to the 
BS, where the end-user can access the data. In some sensor 
network applications, data collection may be needed only 
from a region and, therefore, a subset of nodes will be used. 
A simple approach to accomplishing this data gathering 
task is for each node to transmit its data directly to the BS. 
Since the BS is typically located far away, the cost to 
transmit to the BS from any node is high so nodes will die 
very quickly. Therefore, an improved approach is to use as 
few transmissions as possible to the BS and reduce the 
amount of data that must be transmitted to the BS in order 
to reduce energy. Further, if all nodes in the network 
deplete their energy levels uniformly, then the network can 
operate without losing any nodes for a long time. 

In sensor networks, data fusion helps to reduce the 
amount of data transmitted between sensor nodes and the 
BS [9], [15], [31]. Data fusion combines one or more data 
packets from different sensor measurements to produce a 
single packet, as described in [12]. For example, sensors 
may collect temperature, pressure, humidity, and signal 
data from the field. We would be interested in finding the 
maximum or minimum values of such parameters. Data 
fusion can be used here to combine one or more packets to 
produce a same-size resultant packet. The LEACH protocol 
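
A minimal sketch of this kind of fusion follows, assuming each packet is a mapping from a reading 
name to a value and that the quantity of interest is the maximum (or minimum) per reading. The 
packet layout and function name are hypothetical. 

```python
def fuse(packets, combine=max):
    """Combine several packets into one packet with the same fields."""
    fields = packets[0].keys()
    return {name: combine(p[name] for p in packets) for name in fields}


hottest = fuse([
    {"temperature": 21.5, "pressure": 1012.0},
    {"temperature": 23.0, "pressure": 1009.5},
])
```

The result has the same size as each input packet regardless of how many packets are combined. 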
presented in [12] is an elegant solution to this data 
collection problem where a small number of clusters are 
formed in a self-organized manner. The nice property of the 
LEACH protocol is that it is completely distributed and 
sensor nodes organize in a cluster hierarchy to fuse their 
data to eventually transfer to the BS. In LEACH, a 
designated node in each cluster collects and fuses data 
from nodes in its cluster and transmits the result to the BS. 
LEACH uses randomization to rotate the cluster heads and 
achieves a factor of eight improvement compared to the 
direct approach, before the first node dies. 

In LEACH, clusters are formed in a self-organized 
manner in each round of data collection. About 5 percent 
of the nodes in the network selected randomly become 
cluster heads. These cluster heads send a strong beacon 
signal to all nodes and sensor nodes decide which cluster 
to join based on received signal strength. The distributed 
cluster formation in each round in LEACH may not 
produce good clusters to be efficient. In an improved 
version of this scheme, called LEACH-C [14], this cluster 
formation is done at the beginning of each round using a 
centralized algorithm by the BS. Although the energy cost 
for cluster formation is higher in LEACH-C, the overall 
performance is better than LEACH due to improved 
cluster formation by the BS. The steady state part of the 
LEACH-C protocol, i.e., data collection in rounds, is 
identical to the LEACH protocol (p. 94 in [14]). LEACH-C 
improves the performance by 20 percent to 40 percent (p. 
97 in [14]), depending on the network parameters, 
compared to LEACH in terms of the total number of 
rounds of data collection that can be achieved before 
sensor nodes start to die. 
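
The cluster formation step described above can be pictured with the simplified sketch below. It 
is not the LEACH election procedure from [12]: cluster heads are simply drawn at random at the 
stated 5 percent rate, and the nearest head is used as a stand-in for the strongest received 
beacon. The function name and parameters are hypothetical. 

```python
import math
import random


def form_clusters(nodes, head_fraction=0.05, rng=None):
    """Pick random cluster heads and attach every node to its nearest head."""
    rng = rng or random.Random()
    n_heads = max(1, round(head_fraction * len(nodes)))
    heads = rng.sample(range(len(nodes)), n_heads)
    membership = {
        i: min(heads, key=lambda h: math.dist(position, nodes[h]))
        for i, position in enumerate(nodes)
    }
    return heads, membership
```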

Further improvements can be obtained if each node 
communicates only with close neighbors and only one 
designated node sends the combined data to the BS in each 
round in order to reduce energy. A new protocol based on 
this approach, called PEGASIS (Power-Efficient GAthering 
in Sensor Information Systems), is presented in this paper, 
which significantly reduces energy cost to increase the life 
of the sensor network. The PEGASIS protocol is near 
optimal in terms of energy cost for this data gathering 
application in sensor networks. The key idea in PEGASIS is 
to form a chain among the sensor nodes so that each node 
will receive from and transmit to a close neighbor. Gathered 
data move from node to node, get fused, and, eventually, a 
designated node transmits to the BS. Nodes take turns 
transmitting to the BS so that the average energy spent by 
each node per round is reduced. Building a chain to 
minimize the total length is similar to the traveling 
salesman problem, which is known to be intractable. 
However, with the radio communication energy parameters, 
a simple chain built with a greedy approach 
performs quite well. The PEGASIS protocol achieves 
between 100 to 200 percent improvement when 1 percent, 
25 percent, 50 percent, and 100 percent of nodes die 
compared to the LEACH protocol. PEGASIS performance 
improvement in comparison with LEACH-C will be slightly 
less as LEACH-C improves upon LEACH by about 
20 percent to 40 percent. In the rest of this paper we 
present all our performance comparisons with respect to the 
LEACH protocol with the understanding that the improvement 
is less by the extent that LEACH-C improves upon 
LEACH [14]. When attribute-based search is to be performed, 
then the area and, hence, selected sensor nodes, 
will also change dynamically. In these situations, the BS 
selects the area of interest and only selected nodes in the 
region participate in data collection. We will still use the 
same chain ordering of nodes and only the selected nodes 
will be on to form the truncated chain. Likely, these nodes 
will still be nearby on the shortened chain and the data 
collection will still be efficient. 
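
The greedy chain idea can be sketched as follows. Two details are assumptions made for this 
illustration, since this section does not specify them: the chain is started at the node farthest 
from the BS, and the node that transmits to the BS is rotated by round number. 

```python
import math


def greedy_chain(nodes, bs):
    """Order node indices so that each node is followed by its nearest unvisited node."""
    remaining = list(range(len(nodes)))
    # Assumption: begin with the node farthest from the base station.
    start = max(remaining, key=lambda i: math.dist(nodes[i], bs))
    chain = [start]
    remaining.remove(start)
    while remaining:
        last = nodes[chain[-1]]
        nearest = min(remaining, key=lambda i: math.dist(nodes[i], last))
        chain.append(nearest)
        remaining.remove(nearest)
    return chain


def leader_for_round(chain, round_number):
    """Assumed rotation rule: nodes take turns transmitting to the BS."""
    return chain[round_number % len(chain)]


chain = greedy_chain([(0, 0), (10, 0), (1, 0), (5, 0)], bs=(25, 150))
```

A greedy chain is not guaranteed to be the shortest possible one, which matches the text's point 
that the exact problem resembles the traveling salesman problem. 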

Our scheme can be modified appropriately if some of the 
stated assumptions about sensor nodes are not valid. If 
nodes are not within transmission range of each other, then 
alternative, possibly multihop transmission paths will have 
to be used. In fact, our chain-based schemes will not be 
affected that much as each node communicates only with a 
local neighbor and we can use a multihop path to transmit 
to the BS. We need to make some adjustments in the chain 
construction procedure to ensure that no node is left out. 
Other schemes, including LEACH, rely on direct reachability 
to function correctly. To ensure balanced energy 
dissipation in the network, an additional parameter could 
be considered to compensate for nodes that must do more 
work every round. If the sensor nodes have different initial 
energy levels, then we could consider the remaining energy 
level for each node in addition to the energy cost of the 
transmissions. The assumption of location information is 
not critical. The BS can determine the locations and transmit 
to all nodes or the nodes can determine this through 
received signal strengths. For example, nodes could 
transmit progressively reduced signal strengths to find a 
close neighbor to exchange data. This would require the 
nodes to consume some energy when trying to find local 
neighbors; however, this is only a fixed initial energy cost 
when constructing the chain. If nodes are mobile, then 
different methods of transmission could be examined. For 
instance, if nodes could approximate how often and at what 
speed other nodes are moving, then they could determine 
more intelligently how much power is needed to reach the 
other nodes. Perhaps, the BS can help coordinate the 
activities of nodes in data transmissions. Discussion of 
schemes with mobile sensor nodes is beyond the scope of 
this paper. 

1.2 Energy x Delay Reduction for Data Gathering in 
Sensor Networks 

Another important factor to consider in the data gathering 
application is the average delay per round. Here, we 
assume that data gathering rounds are far apart and the 
only traffic in the network is due to sensor data. Therefore, 
data transmissions in each round can be completely 
scheduled to avoid delays in channel access and collisions. 
The delay for a packet transmission is dominated by the 
transmission time as there is no queuing delay and the 
processing and propagation delays are negligible compared 
to the transmission time. With the direct transmission 
scheme, nodes will have to transmit to the base station one 
at a time, making the delay a total of N units (one unit per 
transmission, where N is equal to the number of nodes). To 
reduce delay, one needs to perform simultaneous transmissions. 
The well-known approach of using a binary scheme 
to combine data from N nodes in parallel will take about 
log N units of delay, although incurring an increased energy 
cost. Energy x delay is an interesting metric to optimize per 
round of data gathering in sensor networks. 
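
The delay counts for the two extremes can be checked with a short sketch. It counts only the 
pairwise combining levels for the binary scheme, so any final transmission to the BS would add 
one more unit; the function names are hypothetical. 

```python
import math


def direct_delay(n_nodes):
    """One transmission slot per node, one node at a time."""
    return n_nodes


def binary_combining_delay(n_nodes):
    """Number of parallel pairwise-combining levels needed to reach one packet."""
    levels = 0
    while n_nodes > 1:
        n_nodes = (n_nodes + 1) // 2  # each level halves the number of live packets
        levels += 1
    return levels
```

For 100 nodes this gives 100 units for direct transmission against 7 combining levels, in line 
with the "about log N" figure in the text. 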

Why the energy x delay metric? Clearly, minimizing energy 
or delay in isolation has drawbacks. For battery operated 
sensors, longevity is a major concern and priorities can be 
entirely different when energy reserves become depleted. 
Energy efficiency often brings additional latency along with 
it. Minimizing delay is not always practical in sensor 
network applications. Maximizing the throughput is not the 
best strategy for energy-critical links. Generally, increased 
energy savings come with a penalty of increased delay. 
However, several practical applications set limits on 
acceptable latency, as specified by QoS requirements. For 
example, the data gathering delay per round may have a 
bound. Therefore, there is a tradeoff between energy spent 
per packet and delay; energy x delay is an appropriate 
measure to optimize for in wireless sensor networks. 
Specifically, our view is that minimizing energy x delay 
while meeting acceptable delays for applications can lead to 
significant power savings. 

Simultaneous wireless communications among pairs of 
nodes is possible only if there is minimal interference 
among different transmissions. CDMA technology can be 
used to achieve multiple simultaneous wireless transmissions 
with low interference. If the sensor nodes are CDMA 
capable, then it is possible to use the binary scheme and 
perform parallel communications to reduce the overall 
delay. However, the energy cost may have to go up slightly 
as there will still be a small amount of interference from 
other unintended transmissions. Alternatively, with a single 
radio channel and non-CDMA nodes, simultaneous transmissions 
are possible only among spatially separated nodes. 
Since the energy costs and delay per transmission for these 
two types of nodes are quite different, we will consider 
energy x delay reduction for our data gathering problem 
separately for these two cases. 

In this paper, we present the following new protocols for 
data gathering using the energy x delay metric: 

• a binary chain-based scheme with CDMA sensor 
nodes, 

• a three level chain-based scheme which performs 
better than direct and PEGASIS with this metric for 
non-CDMA sensor nodes. 

Both of these protocols use hierarchical organization of 
sensor nodes with possible simultaneous data transmis- 
sions in each level of the hierarchy. A greedy chain is 
formed among the sensor nodes in both of these protocols 
which will form the lowest level in the hierarchy. The 
binary scheme has a hierarchy of ⌈log N⌉ levels, where N is the
number of nodes in the sensor network. The second 
protocol uses a 3-level hierarchy by forming groups in 
each level and promoting one node from each group to the 
next level. Simulation results show that both schemes 
perform 80 or more times better than the direct scheme and the
binary scheme performs eight times better than LEACH 
with respect to the energy x delay metric. 

This paper is organized as follows: In Section 2, the radio 
model for energy calculations used throughout this paper is 
discussed. In Section 3, an analysis of the energy cost is 
given for the data gathering problem. The PEGASIS scheme 
is presented in Section 4, which is shown to be a near- 
optimal solution for minimizing energy. In Section 5, an 
analysis of the energy x delay metric for data gathering is 
given. Two new protocols for reducing energy x delay for 
data gathering with and without CDMA capable nodes are 
presented in Sections 6 and 7, respectively. Extensive 
simulation results with different size networks and simula- 
tion parameters are presented in Section 8. In all our 
simulation experiments, we considered only the original 
LEACH protocol and our proposed new protocols. The 
performance improvements with respect to LEACH-C will 
be slightly less corresponding to the extent LEACH-C 
improves upon LEACH. Finally, some concluding remarks 
are given in Section 9. 

2 Radio Model for Energy Calculations 

We use the same radio model as discussed in [12], which is 
the first order radio model. In this model, a radio dissipates 
E_elec = 50 nJ/bit to run the transmitter or receiver circuitry
and ε_amp = 100 pJ/bit/m² for the transmitter amplifier. The
radios have power control and can expend the minimum
required energy to reach the intended recipients. The radios
can be turned off to avoid receiving unintended transmissions.
An r² energy loss is used due to channel transmission
[21], [26]. The equations used to calculate transmission costs
and receiving costs for a k-bit message and a distance d are
shown below:

Transmitting:

$E_{Tx}(k, d) = E_{Tx\text{-}elec}(k) + E_{Tx\text{-}amp}(k, d)$

$E_{Tx}(k, d) = E_{elec} \times k + \epsilon_{amp} \times k \times d^2$

Receiving:

$E_{Rx}(k) = E_{Rx\text{-}elec}(k)$

$E_{Rx}(k) = E_{elec} \times k$

Receiving data is also a high cost operation, therefore, 
the number of receptions and transmissions should be 
minimal to reduce the energy cost of an application. With 
these radio parameters, when k = 2,000 and d² is 500, the
energy spent in the amplifier part equals the energy spent in 
the electronics part and, therefore, the cost to transmit a 
packet will be twice the cost to receive. It is assumed that 
the radio channel is symmetric so that the energy required 
to transmit a message from node i to node j is the same as 
the energy required to transmit a message from node j to 
node i for a given signal-to-noise ratio (SNR), typically
10 dB. For the comparative evaluation purposes of this 
paper, we assume that there are no packet losses in the 
network. It is not difficult to model errors and losses in 
terms of increased energy cost per transmission. With
known channel error characteristics and error coding, this 
cost can be modeled by suitably adjusting the constants in 
the above equations. 
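
The first order radio model above can be written as a short sketch. The constant and function names (`E_ELEC`, `EPS_AMP`, `tx_energy`, `rx_energy`) are illustrative choices, not names from the original simulation code; the values are those stated in this section.

```python
# First order radio model, using the parameter values given in the text.
E_ELEC = 50e-9      # J/bit spent in transmitter or receiver electronics
EPS_AMP = 100e-12   # J/bit/m^2 spent in the transmit amplifier


def tx_energy(k_bits, d_m):
    """Energy in joules to transmit a k-bit message over a distance of d meters."""
    return E_ELEC * k_bits + EPS_AMP * k_bits * d_m ** 2


def rx_energy(k_bits):
    """Energy in joules to receive a k-bit message."""
    return E_ELEC * k_bits


if __name__ == "__main__":
    k = 2000
    d = 500 ** 0.5  # chosen so that d^2 = 500
    print(tx_energy(k, d), rx_energy(k))
```

With k = 2,000 and d² = 500, the amplifier term and the electronics term are both about 100 µJ, so transmitting costs roughly twice as much as receiving, as noted above.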

When there are multiple simultaneous transmissions, the 
transmitted energy should be increased to ensure that the 
same SNR as with a single transmission is maintained. With 
CDMA nodes using 64 or 128 chips per bit (which is 
typical), the interference from other transmissions is 
calculated as a small fraction of the energy from other 
unintended transmissions. This effectively increases the 
energy cost to maintain the same SNR. With non-CDMA 
nodes, the interference will equal the amount of energy seen 
at the receiver from all other unintended transmitters. 
Therefore, only a few spatially distant pairs can commu- 
nicate simultaneously in the network. 

3 Energy Cost Analysis for Data Gathering 

In this section, we will analyze the energy cost of data 
gathering from a sensor web to the distant BS. Recall that 
the data collection problem of interest is to gather a k-bit
packet from each sensor node in each round. Of course, the 
goal is to keep the sensor web operating as long as possible. 
A fixed amount of energy is spent in receiving and 
transmitting a packet in the electronics and an additional 
amount proportional to d² is spent while transmitting a
packet. There is also a cost of 5 nJ/bit/message for 2,000 bit
messages in data fusion. With the direct approach, all nodes
transmit directly to the BS, which is usually located at some
distance from the sensor network. Therefore, every node
will consume a significant amount of power to transmit to 
the BS in each round. Since the nodes have a limited 
amount of energy, nodes will die quickly, causing the 
reduction of the system lifetime. 

As observed in [12], the direct approach would work best 
if the BS is located close to the sensor nodes or the cost of 
receiving is very high compared to the cost of transmitting 
data. For the rest of the analysis, we use 50, 100, and 200- 
node sensor networks. In a scenario where the BS is located 
far away, energy costs can be reduced if the data is gathered 
locally among the sensor nodes and only a few nodes 
transmit the fused data to the BS. This is the approach taken 
in LEACH and its variants, where clusters are formed 
dynamically in each round and cluster-heads (leaders for 
each cluster) gather data locally and then transmit to the BS. 
Cluster-heads are chosen randomly, but all nodes have a 
chance to become a cluster-head in LEACH to balance the 
energy spent per round by each sensor node. For a 100-node 
network in a 50m x 50m field with the BS located at 
(25, 150), which is at least 100 meters from the closest node, 
LEACH achieves a factor of 8 improvement compared to 
the direct approach in terms of number of rounds before the 
first node dies. 

Although this approach is significantly better than the 
direct transmissions to the BS, there is still some room to 
save even more energy. The cost of the overhead to form the 
clusters in LEACH is expensive. In LEACH, in every round, 
five percent of nodes are cluster-heads and these nodes 
must broadcast a signal to reach all nodes to determine the 
members in their clusters. This overhead has been elimi- 
nated in the improved version, LEACH-C [14]; otherwise, 
LEACH-C is identical to LEACH in collection of data in 
each round. However, several cluster-heads, typically five 
in a network of 100 nodes, transmit the fused data from the 
cluster to the distant BS. Further improvement in energy 
cost for data gathering can be achieved if only one node 
transmits to the BS per round and if each node transmits 
only to local neighbors in the data fusion phase. This is 
exactly what is done in the PEGASIS protocol (defined in 
Section 4) to obtain an additional factor of two or more 
improvement compared to LEACH and LEACH-C. 

For the 100-node network shown in Fig. 1, we can 
determine a bound on the maximum number of rounds 
possible before the first node dies. In each round, every 
node must transmit its packet and some node or the BS
must receive it. So, each node spends two times the energy 
cost for electronics and some additional cost, depending on 
how far a node transmits its data. Since at least one node 
must transmit the fused message to the BS in each round, on 
the average each node must incur this cost at least once 
every 100 rounds. With the energy cost parameters and the 
dimensions of the playing field in Fig. 1 with 100 nodes and 
2,000 bit messages, we can calculate the maximum rounds 
possible. The energy spent in each node for 100 rounds is 
about 100*0.0002 joules for the electronics and at least 0.002 
joules for one message transmission to the BS. With an 
initial energy in each node of .25 joules, the maximum 
number of rounds possible before a node dies is given by: 
(100 x 0.25)/0.022 ≈ 1,100.
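
The arithmetic behind this bound can be reproduced with a few lines. This is only a sketch of the calculation described above; the function name and its parameters are illustrative, and the 100 m distance to the BS is the minimum distance stated earlier for this field.

```python
def max_rounds_bound(n_nodes=100, k_bits=2000, initial_j=0.25, d_bs_m=100.0):
    """Upper bound on rounds before the first node dies (fixed costs only)."""
    e_elec = 50e-9      # J/bit, electronics
    eps_amp = 100e-12   # J/bit/m^2, amplifier
    # Each round a node spends electronics energy for one send and one receive.
    electronics_per_round = 2 * e_elec * k_bits
    # Amplifier energy for one transmission to the BS at distance d_bs_m.
    bs_amplifier = eps_amp * k_bits * d_bs_m ** 2
    # Energy one node spends over n_nodes rounds if it leads exactly once.
    per_cycle = n_nodes * electronics_per_round + bs_amplifier
    return n_nodes * initial_j / per_cycle


if __name__ == "__main__":
    print(round(max_rounds_bound()))
```

The per-cycle energy comes out to 0.02 J for the electronics plus 0.002 J for the BS transmission, which gives the figure of roughly 1,100 rounds.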

The actual number of rounds achievable before a node 
dies will be less since we did not account for the energy 
spent in the variable part of transmissions, which depends 
on the distance of transmission and the cost for data fusion. 
Since each node needs to transmit its data at least to its
closest neighbor, there can be about five to 10 percent more
energy cost per round. The exact value clearly depends on 
the distribution of nodes in the network. Therefore, the 
upper bound will likely be less than 1,000 rounds. The 
PEGASIS protocol achieves about 800 rounds, which will 
likely be within 15-20 percent of this upper bound, and
therefore can be claimed to be near optimal. The following 
section presents the details of the PEGASIS protocol. 

4 PEGASIS: Power-Efficient Gathering in 
Sensor Information Systems 

The main idea in PEGASIS is for each node to receive from 
and transmit to close neighbors and take turns being the 
leader for transmission to the BS. This approach will 
distribute the energy load evenly among the sensor nodes 
in the network. We initially place the nodes randomly in the 
playing field and, therefore, the ith node is at a random 
location. The nodes will be organized to form a chain, 
which can either be computed in a centralized manner by 
the BS and broadcast to all nodes or accomplished by the 
sensor nodes themselves using a greedy algorithm. If the 
chain is computed by the sensor nodes, they can first get all 
sensor nodes location data and locally compute the chain 
using the same greedy algorithm. Since all nodes have the 
same location data and run the same algorithm, they will all 
produce the same result. We used random 50, 100, and 200- 
node networks for our simulations with similar parameters 
used in [12]. Since this chain computation is done once, 
followed by many rounds of data communication (typically, 
several hundred rounds, as shown later), the energy cost in 
this overhead is small compared to the energy spent in the 
data collection phase. Therefore, in comparing various 
schemes, we only consider the energy cost for data 
collection, fusion, and transmission to the BS and evaluate 
when the first node dies. With our assumption of no 
mobility, there will be no change in the chain in the case of 
PEGASIS and no change in clusters in LEACH-C until the 
first node dies. 

For constructing the chain, we assumed that all nodes 
have global knowledge of the network and employed the 
greedy algorithm. We could have constructed a loop. 
However, to ensure that all nodes have close neighbors is 
difficult as this problem is similar to the traveling salesman 
problem. The greedy approach to constructing the chain 
works well and this is done before the first round of 
communication. To construct the chain, we start with the 
furthest node from the BS (select a node randomly if there is 
a tie). The closest neighbor to this node will be the next node 
on the chain. Successive neighbors are selected in this 
manner among unvisited nodes (with ties broken arbitra- 
rily) to form the greedy chain. We begin with the farthest 
node in order to make sure that nodes farther from the BS 
have close neighbors as, in the greedy algorithm, the 
neighbor distances will increase gradually since nodes 
already on the chain cannot be revisited. Fig. 2 shows 
node c0 connecting to node c1, node c1 connecting to node
c2, and node c2 connecting to node c3, in that order. When a
node dies, the chain is reconstructed in the same manner to 
bypass the dead node. 
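
A minimal sketch of the greedy chain construction described above follows. The function name `greedy_chain` is hypothetical, and ties are broken by lowest node index here purely to make the result deterministic; the text allows random or arbitrary tie-breaking.

```python
import math


def greedy_chain(nodes, bs):
    """Return node indices in chain order.

    nodes: list of (x, y) positions; bs: (x, y) position of the base station.
    Starts at the node farthest from the BS, then repeatedly appends the
    closest unvisited node. Ties go to the lowest index (an assumption).
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    remaining = set(range(len(nodes)))
    current = max(remaining, key=lambda i: (dist(nodes[i], bs), -i))
    chain = [current]
    remaining.remove(current)
    while remaining:
        here = nodes[current]
        current = min(remaining, key=lambda i: (dist(here, nodes[i]), i))
        chain.append(current)
        remaining.remove(current)
    return chain


if __name__ == "__main__":
    print(greedy_chain([(0, 0), (10, 0), (1, 0), (4, 0)], bs=(25, 150)))
```

Reconstructing the chain after a node dies amounts to calling the same routine on the surviving nodes.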

For gathering data from sensor nodes in each round, 
each node receives data from one neighbor, fuses the data 
with its own, and transmits to the other neighbor on the 
chain. Note that node i will be in some random position j on 
the chain. Nodes take turns transmitting to the BS and we 
will use node number i mod N (N represents the number of
nodes) to transmit to the BS in round i. Thus, the leader in
each round of communication will be at a random position
on the chain, which is important for nodes to die at random
locations. The idea of nodes dying at random places is to
make the sensor network robust to failures.

Fig. 2. Chain construction using the greedy algorithm.

Each round of data collection can be initiated by the BS 
with a beacon signal which will synchronize all sensor 
nodes. Since all nodes know their positions on the chain, we 
can employ a time slot approach for transmitting data. In 
the ith round of data collection, node c(i - 1) will be the 
leader. The end node c0 will transmit its data to node c1 in
slot one, c1 fuses and transmits data in slot two, and so on
until the leader node is reached. In subsequent slots, data 
transmissions happen from the node c(N - 1) and move 
toward the leader node from the right end of the chain. 
Finally, in the Nth slot, the leader transmits data to the BS. 

Alternatively, in a given round, we can use a simple 
control token passing approach initiated by the leader to start 
the data transmission from the ends of the chain. The cost is 
very small since the token size is very small. In Fig. 3, node c2 
is the leader and it will pass the token along the chain first to 
node c0. Node c0 will pass its data toward node c2. After node
c2 receives data from node c1, it will pass the token to node c4,
and node c4 will pass its data towards node c2 with data 
fusion taking place along the chain. 

PEGASIS performs data fusion at every node except the 
end nodes in the chain. Each node will fuse its neighbor's 
data with its own to generate a single packet of the same 
length and then transmit that to its other neighbor (if it has 
two neighbors). In the above example, node c0 will transmit
its data to node c1. Node c1 fuses node c0's data with its
own and then transmits to the leader. After node c2 passes 
the token to node c4, node c4 transmits its data to node c3. 
Node c3 fuses node c4's data with its own and then 
transmits to the leader. Node c2 waits to receive data from 
both neighbors and then fuses its data with its neighbors' 
data. Finally, node c2 transmits one message to the BS. 
Thus, in PEGASIS, each node, except the two end nodes and 
the leader node, will receive and transmit one data packet 
in each round and be the leader once every N rounds. In 
addition, nodes receive and transmit very small control 
token packets. 
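
The order of data transmissions in one PEGASIS round can be sketched as follows. The function `pegasis_round` is a hypothetical helper that works on chain positions 0..N-1; it lists only data packets and leaves out the small control tokens.

```python
def pegasis_round(n, leader):
    """List (sender, receiver) chain positions for one round, in slot order.

    Data moves from the left end toward the leader, then from the right end
    toward the leader, and finally the leader sends the fused packet to the BS.
    """
    left = [(i, i + 1) for i in range(leader)]
    right = [(i, i - 1) for i in range(n - 1, leader, -1)]
    return left + right + [(leader, "BS")]


if __name__ == "__main__":
    # Five-node chain with c2 as leader, as in the token passing example.
    for sender, receiver in pegasis_round(5, 2):
        print(sender, "->", receiver)
```

For a chain of N nodes the list always has N entries, which matches the delay of N units per round: N - 1 local transmissions plus one transmission to the BS.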

With our simulation experiments, we found that the 
greedy chain construction performs well with different size 
networks and random node placements. In constructing the 
chain, it is possible that some nodes may have relatively 
distant neighbors along the chain. Such nodes will dissipate 
more energy in each round compared to other sensors. We 
improved the performance of PEGASIS by not allowing 
such nodes to become leaders. We accomplished this by 
setting a threshold on neighbor distance to be leaders. We 
may be able to slightly improve the performance of 
PEGASIS further by applying a threshold adaptive to the
remaining energy levels in nodes. Whenever a node dies,
the chain will be reconstructed and the threshold can be
changed to determine which nodes can be leaders.

Fig. 3. Token passing approach.

PEGASIS protocol improves on LEACH by saving 
energy in several stages. First, in the local gathering, the 
distances that most of the nodes transmit are much less 
compared to transmitting to a cluster-head in LEACH. 
Second, the amount of data for the leader to receive is at 
most two messages instead of 20 (20 nodes per cluster in 
LEACH for a 100-node network). Finally, only one node 
transmits to the BS in each round of communication. 

5 Energy x Delay Analysis for Data Gathering

In this section, we will analyze the energy x delay cost per 
round for data gathering from a sensor web to the distant 
BS. The delay cost can be calculated as units of time. On a 
2 Mbps link, a 2,000 bit message can be transmitted in 1 ms.
Therefore, each unit of delay will correspond to about 1 ms
time for the case of a single channel and non-CDMA sensor
nodes. The actual delay value will be different with CDMA
nodes, depending on the effective data rate. For each of the
systems, we assume that the delay is one unit for each 2,000 
bit message transmitted. 

The energy x delay cost for data gathering in a network 
of N nodes will be different for the schemes considered in 
this paper and will depend on the node distribution in the 
playing field. Consider an example network where the 
N nodes are along a straight line with equal distance of d 
between each pair of nodes and the BS is a far distance from 
all nodes. The direct transmission to the BS scheme will 
require high energy cost and the delay will be N as nodes 
transmit to the BS sequentially. The PEGASIS scheme forms 
a chain among the sensor nodes so that each node will 
receive from and transmit to a close neighbor. For this linear 
network with equally spaced nodes, the energy cost in 
PEGASIS is minimized and the variable cost is proportional 
to N x d² and the delay will be N units. Therefore, the
energy x delay cost will be N² x d².

In the binary scheme with perfect parallel transmission 
of data, there will be N/2 nodes transmitting data to their 
neighbors at distance d in the lowest level. The nodes that 
receive data will fuse the data with their own data and will 
be active in the next level of the tree. Next, N/4 nodes will
transmit data to their neighbors at a distance 2d and this
procedure continues until a single node finally transmits the
combined message to the BS. Thus, for the binary scheme,
the energy cost will be:

$N/2 \times d^2 + N/4 \times (2d)^2 + N/8 \times (4d)^2 + \ldots + 1 \times (N/2 \times d)^2$

since the distance doubles as we go up the hierarchy. In
addition, there will be a single transmission to the BS and
the energy cost depends on the distance to the BS. Without
including this additional cost, simplifying the above
expression gives the energy cost for the binary scheme as:

$N/2 \times d^2 \times (1 + 2 + 4 + \ldots + N/2)$,

which equals

$N(N - 1)/2 \times d^2$.

With the additional transmission to the BS, we can
approximate the total energy cost for the binary scheme to be:

$N^2/2 \times d^2$.

With the delay cost of about log N units, the energy x
delay cost for the binary scheme is N²/2 x d² x log N.
Therefore, for this linear network, the binary scheme will 
be more expensive than PEGASIS in terms of 
energy x delay. For random distribution of nodes in a 
rectangular playing field, the distances do not double as 
we go up the hierarchy in the binary scheme and the 
reduced delay will help reduce the energy x delay cost. It is 
difficult to analyze this cost for randomly distributed nodes 
and we will use simulations to evaluate this cost. 
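
For the linear network, the closed forms above can be checked numerically. The sketch below assumes N is a power of two, takes logarithms to base 2, and counts only the distance-dependent (amplifier) term in units of d²; all function names are illustrative.

```python
import math


def binary_variable_energy(n, d=1.0):
    """Sum of the per-level costs N/2 x d^2 + N/4 x (2d)^2 + ... + 1 x (N/2 x d)^2."""
    assert n >= 2 and n & (n - 1) == 0, "sketch assumes N is a power of two"
    levels = n.bit_length() - 1  # log2(N) levels below the BS transmission
    return sum((n >> (lvl + 1)) * ((1 << lvl) * d) ** 2 for lvl in range(levels))


def pegasis_energy_delay(n, d=1.0):
    """N x d^2 variable energy times a delay of N units."""
    return (n * d ** 2) * n


def binary_energy_delay(n, d=1.0):
    """Approximate N^2/2 x d^2 energy times a delay of about log2(N) units."""
    return (n * n / 2) * d ** 2 * math.log2(n)


if __name__ == "__main__":
    n = 64
    print(binary_variable_energy(n), n * (n - 1) / 2)
    print(pegasis_energy_delay(n), binary_energy_delay(n))
```

For N = 64 the binary scheme's energy x delay is about three times that of PEGASIS, which illustrates why the binary scheme is the more expensive of the two on this linear topology.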

For the rest of the analysis, we assume 50, 100, and 200- 
node sensor networks in a square field with the BS located 
far away. In this scenario, energy costs can be reduced if the
data is gathered locally among the sensor nodes and only a 
few nodes transmit the fused data to the BS. This is the 
approach taken in LEACH [12], where clusters are formed 
dynamically in each round and cluster-heads (leaders for 
each cluster) gather data locally and then transmit to the BS. 
Cluster-heads are chosen randomly, but all nodes have a 
chance to become a cluster-head in LEACH to balance the 
energy spent per round by each sensor node. Nodes are 
able to transmit simultaneously to their cluster-heads using 
CDMA. For a 100-node network in a 50m x 50m field with 
the BS located at (25, 150), which is at least 100 meters from 
the closest node, LEACH reduces the energy x delay cost 
compared to the direct scheme. For the linear network of 
N nodes that are equally spaced, LEACH will have slightly 
higher energy compared to PEGASIS due to the cluster-
heads' transmissions to the BS and a delay of roughly N/c,
where c is the number of clusters. With five clusters 
suggested in [12], the energy x delay for LEACH will be 
lower than for PEGASIS for a 50m x 50m network. 
However, for a 100m x 100m network, the energy x delay 
for LEACH will be higher than for PEGASIS since PEGASIS 
achieves increased energy savings with more 
sparse networks. 

The next two sections present protocols that are designed 
to minimize the energy x delay metric. 

6 A Chain-Based Binary Approach Using 
CDMA Capable Sensor Nodes 

First, we consider a sensor network with nodes capable of 
CDMA communication. With this CDMA system, it is 
possible for node pairs that communicate to use distinct 
codes to minimize radio interference. Thus, parallel com- 
munication is possible among 50 pairs for a 100-node 
network. In order to minimize the delay, we will combine 
data using as many pairs as possible in each level, which 
results in a hierarchy of ⌈log N⌉ levels. At the lowest level,
we will construct a linear chain among all the nodes, as was
done in PEGASIS, so that adjacent nodes on the chain are
nearby. For constructing the chain, we assume that all
nodes have global knowledge of the network and employ
the greedy algorithm described in Section 4.

Fig. 4. Data gathering in a chain-based binary scheme.

For gathering data in each round, each node transmits to 
a close neighbor in a given level of the hierarchy. This 
occurs at every level in the hierarchy, but the only 
difference is that the nodes that are receiving at each level 
are the only nodes that will be active in the next level. 
Finally, at the top level, the only node remaining will be the 
leader and the leader will transmit the k bit message to the 
BS. Note that node i will be in some random position j on 
the chain. Nodes take turns transmitting to the BS and we 
will use node number i mod N (N represents the number of
nodes) to transmit to the BS in round i. In Fig. 4, for round
three (first round is round zero), node c3 is the leader. Since
node c3 is in position 3 (counting from 0) on the chain, all
nodes in an even position will send to their right neighbor.
Now, at the next level, node c3 is still in an odd position, so,
again, all nodes in an even position will fuse their data with
their received data and send to their right. At the third level,
node c3 is not in an odd position, so node c7 will fuse its
data and transmit to c3. Finally, node c3 will combine its
current data with that received from c7 and transmit the
message to BS.
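
The level-by-level pairing can be sketched as below for an eight-node chain with the leader at position 3. The function `binary_schedule` is hypothetical; in particular, letting an unpaired end node simply wait for the next level when a level has an odd number of active nodes is an assumption of this sketch, not a rule stated in the text.

```python
def binary_schedule(n, leader):
    """Return a list of levels; each level is a list of (sender, receiver) pairs.

    Positions are chain indices 0..n-1. At each level, if the leader sits at
    an odd index among the active nodes, even-indexed nodes send to their
    right neighbor; otherwise odd-indexed nodes send to their left neighbor.
    Receivers stay active for the next level.
    """
    active = list(range(n))
    levels = []
    while len(active) > 1:
        leader_is_odd = active.index(leader) % 2 == 1
        sends, survivors = [], []
        for j in range(0, len(active), 2):
            if j + 1 >= len(active):
                survivors.append(active[j])  # unpaired end node waits (assumption)
            elif leader_is_odd:
                sends.append((active[j], active[j + 1]))
                survivors.append(active[j + 1])
            else:
                sends.append((active[j + 1], active[j]))
                survivors.append(active[j])
        levels.append(sends)
        active = survivors
    levels.append([(leader, "BS")])
    return levels


if __name__ == "__main__":
    for level in binary_schedule(8, 3):
        print(level)
```

For eight nodes this produces three levels of parallel transmissions followed by the single transmission to the BS, so the delay is four units rather than eight.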

The chain-based binary scheme performs data fusion at 
every node that is transmitting except the end nodes in each 
level. Each node will fuse its neighbor's data with its own to 
generate a single packet of the same length and then 
transmit that to the next node. In the above example, node c0
will pass its data to node c1. Node c1 fuses node c0's data
with its own and then transmits to node c3 in the next level.
In our simulations, we ensure that each node performs an 
equal number of sends and receives after N rounds of 
communication and each node transmits to the BS in one of 
N rounds. We then calculate the average energy cost per 
round, while the delay cost is the same for each round. We 
compute the average energy x delay cost over a number of 
different node distributions. Experimental results are 
presented in detail in Section 8. 

The chain-based binary scheme improves on LEACH 
and LEACH-C by saving energy and delay in several 
stages. At the lower levels, nodes are transmitting at shorter 
distances compared to nodes transmitting to a cluster-head 
in the LEACH protocol and only one node transmits to the 
BS in each round of communication. By allowing nodes to 
transmit simultaneously, the delay cost for the binary 
scheme decreases from that of LEACH by a factor of about 
three. While, in LEACH and LEACH-C, only five groups 
can transmit simultaneously for a 100-node network, here, 
at each level, we have more nodes transmitting simultaneously.
At each level of the binary scheme, transmissions
are simultaneous, making the total delay ⌈log N⌉ + 1,
including the transmission to the BS. In LEACH and
LEACH-C, the delay for 100-node networks will be 27 units.
The delay for all nodes to transmit to the cluster-head is the 
max number of nodes in any of the five clusters. If all the 
clusters are of the same size, then the delay would be 19. 
Then, all five cluster-heads must take turns to transmit to 
the BS, making that a total of 24. For overhead calculations, 
we have one unit of delay for cluster formation, one unit of
delay for all nodes to broadcast to the cluster-head their
presence in that cluster, and, finally, one unit of delay for
the cluster-head to broadcast a schedule sequence to the 
nodes so that all nodes within a cluster know when to 
transmit their data to the cluster-head. 
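
The delay comparison can be reproduced with the short sketch below. It assumes equal-size clusters and a base-2 logarithm, and the function names are illustrative.

```python
def leach_delay(n_nodes=100, n_clusters=5, overhead_units=3):
    """Delay units per round for LEACH with equal-size clusters."""
    members = n_nodes // n_clusters - 1   # non-head nodes sending in turn
    return members + n_clusters + overhead_units


def binary_delay(n_nodes):
    """Delay units per round for the chain-based binary scheme."""
    ceil_log2 = (n_nodes - 1).bit_length()   # ceil(log2(N)) for N >= 1
    return ceil_log2 + 1                     # plus the transmission to the BS


if __name__ == "__main__":
    print(leach_delay(), binary_delay(100), leach_delay() / binary_delay(100))
```

For 100 nodes this gives 27 units for LEACH against 8 units for the binary scheme, a ratio of a little over three.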

7 A Chain-Based Three Level Scheme without 
CDMA Capable Sensor Nodes 

CDMA may not be applicable for all sensor networks as 
these nodes can be expensive. Therefore, we need a protocol 
that will achieve a minimal energy x delay with non-CDMA 
nodes. It will not be possible to use the binary scheme in 
this case as the interference will be too much at lower levels. 
We either have to increase the energy cost significantly or 
take more time steps at lower levels of the hierarchy, both of 
which will lead to much higher energy x delay cost. There- 
fore, in order to improve energy x delay, we need a protocol 
that allows simultaneous transmissions that are far apart to 
minimize interference while achieving reasonable delay 
cost. Based on our experiments, we suggest the chain-based 
3-level scheme for data gathering in sensor networks with 
non-CDMA nodes. 

Also, in the 3-level scheme, we start with the linear chain 
among all the nodes and divide them into G groups, with 
each group having N/G successive nodes of the chain. 
Therefore, we will have G groups of N/G nodes. One node 
from each group will be active in the second level and, thus, 
there will be G nodes. These G nodes in the second level are 
divided into two groups of successive nodes in order to 
maintain only three levels in the hierarchy. G is calculated 
based on the number of nodes and the size of the network. 
For a 100m x 100m network, we found that, when G is 
equal to 10, we get the best balance for energy and delay. In 
a 100-node network, therefore, only 10 simultaneous 
transmissions take place at the same time and data fusion 
takes place at each node (except the end nodes in each 
level). The transmissions are also far enough apart that 
there is minimal interference and we can still maintain low 
energy costs at each level in the hierarchy while maintain- 
ing a low delay. Fig. 5 shows an example of this scheme 
with 100 nodes. We will have a different leader in each
round transmit to the BS to evenly distribute the workload
among the sensor nodes. As before, we will use node i
along the chain to be the leader in the ith round of
communication. We find the index i within a group which
will represent the leader position modulo N/G.

Fig. 5. Chain-based 3-level scheme for a sensor network with non-CDMA nodes.

In Fig. 5, node c18 is our leader. Then all nodes will send
their data in the direction of index 8 within their group
since 18 modulo 10 is 8. The delay at the first level is nine
units. Then the second level will contain nodes
c8, c18, c28, ..., c98. These 10 nodes will be divided into two
groups. If we have more levels in the hierarchy, then
distances between nodes become larger, causing
higher energy costs. By experimentation, for the networks
under consideration, having three levels gives us the best
balance of energy and delay. Since the leader position is 18,
all nodes that are in the first group will send down the
chain 10 positions from their own position on the chain. So,
node c48 will send to node c38, and node c38 will send to
node c28 and so on. Since node c8's position is less than
node c18's, node c8 will transmit to a position that is N/G
greater than its own. In group two, nodes know in which
direction to send the data using the leader position N/2. So,
here, the nodes in group two would send in the direction of
node c68 in the same manner as in group one. This gives us
a delay of four units for the second level. In the third level,
node c68 transmits to node c18, which is our leader, and then,
finally, node c18 transmits the combined packet to the BS,
giving us a total delay of 15 units. The transmission
schedule can be programmed once at the beginning so that
all nodes know where to send data in each round of
communication.
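
The delay count and the second-level membership for this example can be sketched as follows. Both functions are hypothetical helpers; they assume that G divides N evenly and that the second level is split into two equal groups, as in the 100-node example.

```python
def second_level_nodes(n, g, leader):
    """Chain positions promoted to the second level, one per group."""
    group_size = n // g
    offset = leader % group_size          # leader position modulo N/G
    return [group * group_size + offset for group in range(g)]


def three_level_delay(n=100, g=10):
    """Delay units per round for the chain-based 3-level scheme."""
    level1 = n // g - 1    # sequential sends within each group of N/G nodes
    level2 = g // 2 - 1    # sequential sends within each of the two groups
    level3 = 1             # one second-level leader sends to the round leader
    to_bs = 1              # round leader sends the fused packet to the BS
    return level1 + level2 + level3 + to_bs


if __name__ == "__main__":
    print(second_level_nodes(100, 10, 18))
    print(three_level_delay())
```

With N = 100 and G = 10 the levels contribute 9, 4, and 1 units, plus 1 unit for the BS transmission, giving the total of 15 units quoted above.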

8 Experimental Results 

This section presents the performance analysis of the 
different protocols using simulation programs written in
the C programming language. We used several simulation
parameter variations to test our schemes. The network 
dimensions studied were 50m x 50m and 100m x 100m. 
The BS locations were varied at (50, 150), (50, 200), and 
(50, 300). The packet sizes considered were 2,000, 10,000, 
and 20,000 bits. The number of nodes was varied as 50, 100,
and 200 to test for dense and sparse networks. Extensive 
simulations were run to determine the optimal number of 
clusters to use when the number of nodes varied for the 
LEACH protocol. The LEACH protocol uses five clusters 
for a 100-node network. We found that, for a 200-node 
network, five clusters were optimal, and, for a 50-node 
network, two clusters were optimal. 

8.1 Comparison of LEACH and PEGASIS Using the 
Energy Metric 

For this experiment, the metric studied was the number of
rounds of communication achieved when 1 percent,
25 percent, 50 percent, and 100 percent of the nodes die
using direct transmission, LEACH, and PEGASIS. Each 
node is assumed to have the same initial energy level of 
0.25J. Once a node dies due to battery power depletion, it is 
not recharged for the rest of the simulation. LEACH-C 
improves upon LEACH by about 40 percent due to the 
centralized computation by the BS to find better clusters 
[14]. Therefore, as stated before, in the rest of this section, 
we present our comparison results only with LEACH. The 
performance improvements will be correspondingly smaller
compared to LEACH-C to the extent LEACH-C improves 
upon LEACH, which is about 20 percent to 40 percent, 
depending on network parameters [14]. 

Fig. 6 shows the number of rounds until 1 percent,
25 percent, 50 percent, and 100 percent of the nodes die for a
50m x 50m network. PEGASIS is approximately two times
better than LEACH in all cases for a 50m x 50m network.
The overhead energy cost in forming clusters in LEACH or
the chain in PEGASIS is similar. It may be more useful to
compute this centrally in the BS, which doesn't have an 
energy limitation. The improvements in PEGASIS come due 
to fewer nodes transmitting data to BS in each round 
compared to LEACH and its variants. 

The next set of experiments was conducted for
a 100m x 100m network. Fig. 7 shows the number of 
rounds completed for the same percentages of node deaths 
with different locations of the BS. The BS locations are at 
(50, 150), (50, 200), and (50, 300). 

The simulation results show that PEGASIS achieves: 

• approximately two times the number of rounds 
compared to LEACH when 1 percent, 25 percent, 50 
percent, and 100 percent of nodes die for 
a 50m x 50m network, 

• approximately three times the number of rounds 
compared to LEACH when 1 percent, 25 percent, 50 
percent, and 100 percent of nodes die for
a 100m x 100m network, 

• balanced energy dissipation among the sensor nodes 
to have full use of the complete sensor network, 

• near-optimal performance. 

However, there are some rare cases when the first node 
death occurs with PEGASIS slightly earlier in comparison to 
LEACH, as shown in Fig. 7a. This is due to the greedy chain 
construction procedure used, where a node may have a 
local neighbor very far away and thus will deplete energy 
more quickly and die first. This happens only for some
distributions of nodes. One approach to ensuring that
PEGASIS always performs best before the first node death
occurs is to construct a chain in which all nodes have
relatively close neighbors. Constructing such a chain
requires global knowledge of all node
positions to pick suitable neighbors and minimize the
maximum neighbor distance. This problem is related to the
traveling salesman problem of minimizing the total length
of the loop (chain), which is known to be intractable.
Heuristic algorithms to solve this problem can be expensive
compared to the simple scheme used in PEGASIS, and the
advantages are minimal, as PEGASIS is nearly optimal in
terms of rounds achievable when a larger percentage of
nodes die.

Fig. 6. Performance results for a 50m x 50m network, BS location at
(25, 150), and 100 nodes.
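The effect of greedy chain construction on the longest neighbor distance can be illustrated with a minimal Python sketch. It assumes the usual greedy rule (start at the node farthest from the BS, then repeatedly append the nearest node not yet on the chain); the function names and the five-node layout are hypothetical and are not taken from the simulations reported here.

```python
import math

def greedy_chain(nodes, bs):
    """Greedy chain (assumed rule): start at the node farthest from the BS,
    then repeatedly append the nearest node that is not yet on the chain."""
    remaining = list(range(len(nodes)))
    start = max(remaining, key=lambda i: math.dist(nodes[i], bs))
    chain = [start]
    remaining.remove(start)
    while remaining:
        last = nodes[chain[-1]]
        nxt = min(remaining, key=lambda i: math.dist(nodes[i], last))
        chain.append(nxt)
        remaining.remove(nxt)
    return chain

def max_link(nodes, chain):
    """Longest hop between consecutive neighbors on the chain."""
    return max(math.dist(nodes[a], nodes[b]) for a, b in zip(chain, chain[1:]))

# Hypothetical layout: four nodes in a row plus one node that is left over
# until the end, so the last hop on the chain is much longer than the others.
nodes = [(0, 0), (1, 0), (2, 0), (3, 0), (1, 5)]
bs = (50, 150)
chain = greedy_chain(nodes, bs)
longest = max_link(nodes, chain)
```

In this layout the first three hops have length 1, while the final hop is about 5.4 units long; a node with such a distant neighbor spends more energy per round, which is the kind of situation that can lead to an early first node death.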



8.2 Comparison of All Schemes Using the
Energy x Delay Metric

To evaluate the performance of the chain-based binary 
scheme and the chain-based 3-level scheme, we simulated 
direct transmission, PEGASIS, LEACH, and the two new 
schemes using several random 50, 100, and 200-node 
networks with CDMA nodes and non-CDMA nodes. We 
used the same simulation parameters as described above for 
evaluating PEGASIS. However, instead of running the 
simulations for percentage of node deaths, we ran the 
simulations for enough rounds in all the schemes so that all 
N nodes had a chance to become leader exactly once. Since
different schemes have to run for a different number of
rounds before every node has a chance to become leader
exactly once, it does not make sense to compare the number of
rounds before nodes die. By doing this, we can compare the 
average energy costs per round for all the schemes fairly. 
We then used these costs to determine the average energy 
cost per round of data gathering for several different 
topologies. To calculate the energy x delay for these
schemes, we multiply the average energy cost per round
by the unit delay for the scheme. In both CDMA and non-CDMA
systems, we included the interference costs when
there are simultaneous transmissions to ensure that the
same SNR of 10 dB is maintained as with single transmission.
For the 3-level scheme, we evaluated the number of
groups for the first level when the number of nodes changes
in the network to guarantee the optimal energy x delay. We
found that, for 50-node, 100-node, and 200-node networks,
dividing the nodes into 10 groups gave us the optimal
energy x delay.

Fig. 7. Performance results for a 100m x 100m network with the BS location at (a) pos = (50, 150), (b) pos = (50, 200), and (c) pos = (50, 300). The
packet size is 2,000 bits and the number of nodes is 100.

TABLE 1

Energy x Delay Cost for Direct, PEGASIS, LEACH,
Chain-Based Binary Scheme, and the Chain-Based Three-Level Scheme

Protocol                                Energy                Delay         Energy x Delay
                                        D = 50     D = 100    D = 50 and    D = 50      D = 100
                                                              D = 100
Direct (both systems)                   0.3299     1.2805     100           32.9938     128.0459
PEGASIS (both systems)                  0.0240     0.0361     100           2.4008      3.6107
LEACH (CDMA nodes)                      0.0797     0.2048     27            2.1518      5.5292
Chain-based binary (CDMA nodes)         0.0318     0.0559     8             0.2547      0.4516
Chain-based 3 level (non-CDMA nodes)    0.0358     0.0583     15            0.5365      0.8743

These results are for a 50m x 50m and 100m x 100m network where D equals the dimension of a D x D network.

Table 1 gives the results for energy cost, delay cost, and 
energy x delay cost for direct, PEGASIS, LEACH, the chain- 
based binary scheme, and the chain-based 3-level scheme. 

Fig. 8 shows the results for the five schemes based on 
different BS locations. Energy x delay is higher for all 
schemes as the BS moves farther away from the nodes. 
Fig. 9 shows the results for the five schemes based on 
different packet sizes. As expected, energy x delay increases 
with the packet size. Fig. 10 shows that, as the number of
nodes increases, energy x delay becomes greater for all
schemes. In all of these figures, the binary scheme performs
best; however, if the sensors are not CDMA capable, then the
3-level scheme is the best.

The simulation results show that: 

• The chain-based binary scheme is approximately 
eight times better than LEACH and 130 times better 
than direct for a 50m x 50m network in terms of 
energy x delay for sensor networks with CDMA 
nodes. 

• The chain-based binary scheme is approximately 
five to 13 times better than LEACH and 80 or more 
times better than the direct scheme for a 100m x 
100m network in terms of energy x delay for sensor 
networks with CDMA nodes. 

• The chain-based 3-level scheme is approximately
four times better than PEGASIS and 60 times better 
than direct for a 50m x 50m network in terms of 
energy x delay for sensor networks with non-CDMA 
nodes. 

• The chain-based 3-level scheme is approximately 
three to five times better than PEGASIS and up to 
140 times better than direct for a 100m x 100m 
network in terms of energy x delay for sensor 
networks with non-CDMA nodes. 

• The chain-based schemes show a more balanced 
energy dissipation among the sensor nodes to have 
full use of the complete sensor network. 
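The improvement factors quoted above for the 50m x 50m network can be checked against Table 1 with a short Python sketch. The dictionary and helper names are illustrative only; the energy and delay values are copied from Table 1, and the products differ slightly from the tabulated energy x delay column because the tabulated energies are rounded.

```python
# Energy per round and delay (in units) for the 50m x 50m network, from Table 1.
table_d50 = {
    "Direct": (0.3299, 100),
    "PEGASIS": (0.0240, 100),
    "LEACH": (0.0797, 27),
    "Binary": (0.0318, 8),
    "3-level": (0.0358, 15),
}

# Energy x delay is the product of the two columns.
cost = {name: energy * delay for name, (energy, delay) in table_d50.items()}

def improvement(baseline, scheme):
    """How many times lower the energy x delay of `scheme` is than `baseline`."""
    return cost[baseline] / cost[scheme]

binary_vs_leach = improvement("LEACH", "Binary")        # roughly 8
binary_vs_direct = improvement("Direct", "Binary")      # roughly 130
three_vs_pegasis = improvement("PEGASIS", "3-level")    # roughly 4
three_vs_direct = improvement("Direct", "3-level")      # roughly 60
```

The same helper can be applied to the D = 100 columns to examine the second and fourth bullets.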



9 Conclusions and Future Work 

In this paper, we describe three new protocols for wireless 
sensor networks. One of these protocols, PEGASIS, is a 
greedy chain protocol that is near optimal for a data- 
gathering problem in sensor networks. PEGASIS out- 
performs LEACH by eliminating the overhead of dynamic 
cluster formation, minimizing the distance nonleader nodes 
must transmit, limiting the number of transmissions and 
receptions among all nodes, and using only one transmis- 
sion to the BS per round. Nodes take turns to transmit the 
fused data to the BS to balance the energy depletion in the 
network and preserve the robustness of the sensor web as 
nodes die at random locations. Distributing the energy load 
among the nodes increases the lifetime and quality of the 
network. Our simulations show that PEGASIS performs 
better than LEACH by about 100 to 200 percent when 
1 percent, 25 percent, 50 percent, and 100 percent of nodes 
die for different network sizes and topologies. The 
improvements will be slightly smaller compared to
LEACH-C, which doesn't have the cluster formation over- 
head in each round. PEGASIS shows an even further 
improvement as the size of the network increases. 

The other two protocols described in this paper that 
reduce the energy as well as delay for data gathering in 
sensor networks are a chain-based binary scheme for sensor 
networks with CDMA nodes and a chain-based 3-level 
scheme for sensor networks with non-CDMA nodes. The
binary scheme performs better than direct, PEGASIS, and
LEACH. It performs better than LEACH by a factor of about
eight, about 10 times better than PEGASIS, and more than
100 times better when compared to the direct scheme. In
these experiments, the contributions of interfering transmissions
are assumed to be 1/128 of their transmission
energy. With non-CDMA nodes, the interfering energy
is the amount received from unintended transmissions. The
chain-based 3-level scheme with non-CDMA nodes outperforms
PEGASIS by a factor of four and is better than direct by a
factor of 60. The scheme outperforms PEGASIS by dividing
the chain into "groups" and allowing simultaneous transmissions
among pairs in different groups. While energy is still
minimal, the delay is decreased from 100 units to 15 units.

Fig. 8. Performance results for a 100m x 100m network with BS
locations at (50, 150), (50, 200), and (50, 300). The packet size is
2,000 bits and the number of nodes in the network is 100.

Fig. 10. Performance results for a 100m x 100m network with the BS
location at (50, 300). The packet size is 2,000 bits and the number of
nodes varies over 50, 100, and 200.

It is not clear what the optimal scheme for
optimizing energy x delay in a sensor network is. Since the
energy costs of transmissions depend on the spatial
distribution of nodes, there may not be a single scheme
that is optimal for all sizes of the network. Our preliminary
experimental results indicate that, for small networks,
the binary scheme performs best, as minimizing delay
achieves the best result for energy x delay. With larger networks,
we expect that nodes in the higher levels of the
hierarchy will be far apart and it is possible that a different 
multilevel scheme may outperform the binary scheme. 
When using non-CDMA nodes, interference effects can be 
reduced by carefully scheduling simultaneous transmis- 
sions. Since there is an exponential number of possible 
schedules, it is intractable to determine the optimal 
scheduling to minimize energy x delay cost. A practical 
scheme to employ will depend on the size of the playing 
field and the distribution of nodes in the field. 

In order to validate our assumptions, more detailed 
models and a network simulator, such as ns-2, need to be 
used for detailed evaluations. Based on our C simulations, 
we expect that PEGASIS will outperform LEACH and its 
variants and direct protocols in terms of system lifetime and 
the quality of the network for minimizing energy. We also 
expect that the binary chain-based scheme and the 3-level 
chain-based scheme will outperform direct, LEACH and its 
variants, and PEGASIS in terms of energy x delay. We also 
restricted our discussions to the $d^2$ model for energy
dissipation for wireless communications in this paper. In 
our future work, we will consider higher order energy 
dissipation models and develop schemes to minimize 
energy and energy x delay costs for this type of data 
gathering and other applications in sensor networks.

Abstract. This paper presents the performance of adaptive antennas in a 1/3 reuse frequency hopping GSM
network using conventional beamforming. It mainly focuses on C/I improvement for the purpose of capacity
enhancement. The performance evaluation has been conducted by means of network computer simulations, where
measured time-space radio channel impulse responses are applied for the desired user in the network. Measurements
with an M = 8 element uniform linear array were conducted in the cities of Aarhus, Denmark, and
Stockholm, Sweden. The simulated C/I improvement shows an almost $10 \log_{10}(M)$ behavior for low azimuth
spread values. For large values of azimuth spread (relative to the antenna beamwidth), the performance gain is
reduced significantly. For an azimuth spread of 10°-12°, which has been measured in urban macro-cellular environments,
the C/I gain for M = 8 is reduced to approx. 5.5-7.5 dB (which should be compared to the theoretical
value of 9 dB for a point source). The designed DoA algorithm is very robust to co-channel interference and only a
small degradation in performance is observed for single element C/I down to approx. -8 dB. We conclude that the
designed beamforming implementation facilitates a potential capacity gain of a factor of 3 in a 1/3 reuse FH-GSM network
for an array size of M = 4-6.

Keywords: adaptive antennas, GSM, capacity. 



1. Introduction 

In recent years, there has been an enormous interest in adaptive base station antenna techno- 
logy for GSM (among other 2nd and 3rd generation land mobile communication systems) for 
both enhanced capacity and coverage range [1, 2, 25]. The theory and the potential performance
of adaptive antennas have been described in numerous publications [3-5, 22]. However,
due to several practical aspects of adaptive antennas, such as hardware implementation [8] and 
frequency assignment in a network with only a few adaptive base stations [6], a commercial 
exploitation of adaptive antennas may likely take a slow evolutionary path [7]. The promising 
theoretical results still need to be validated by large-scale field trials [9]. 

The EU funded ACTS project TSUNAMI II [10] was established in 1995 to study the 
feasibility and cost efficiency of deploying adaptive antennas for 3rd generation mobile com- 
munication systems. A major effort of the TSUNAMI II project was put into prototyping and 
conducting field trials with an adaptive antenna base station for GSM-1800 [25]. As a partner
in TSUNAMI II, the Center for Personkommunikation at Aalborg University, Denmark,
designed a robust beamforming algorithm to be used for the field trials [18]. 

This paper concentrates on the design and performance of the proposed beamforming 
algorithm. There were several reasons for mainly concentrating our effort on conventional 
beamforming: (a) the issue of downlink transmission in a frequency division duplex (FDD)
system and (b) the TSUNAMI II adaptive antenna base station used in the field trial was
incapable of performing instantaneous weight calculation, which, for example, is required for
Optimum Combining (OC) techniques [23] or Maximal Ratio Combining [12]. The paper is organized
as follows. In Section 2, we discuss some GSM system aspects relevant for the design
of the beamforming algorithm. Section 3 briefly describes the time-space measurements used 
for the network simulations, while Section 4 presents the simulated network configuration. 
The proposed beamforming algorithm is derived in Section 5 and the performance for uplink 
and downlink are shown in Sections 6 and 7, respectively. Finally, conclusions are drawn 
about the work in Section 8. 



2. GSM Capacity Aspects 

It is anticipated that slow Frequency Hopping (FH) is a powerful feature for enhancing the 
capacity of GSM networks. In fact, most GSM operators see FH as the most promising and 
cost-efficient solution for enhancing their network capacity [11]. Thanks to channel coding 
and interleaving in GSM [31], FH provides both frequency and interference diversity [11]. 
When considering high-capacity GSM networks, it is therefore deemed mandatory to consider 
adaptive base station antennas along with other optional capacity-enhancing features in GSM, 
such as: FH, Power Control (PC), and discontinuous transmission (DTX). 

In a conventional non-FH GSM network, the minimum frequency reuse distance for a 
three-sectored base station configuration is K/n = 4/12 [11], where K is the cluster size 
and n is the number of frequencies used within a cluster. The 4/12 frequency reuse is al- 
ways applicable for the BCCH (Broadcast Channel) carrier, because continuous transmission 
with full power is mandatory. If using synthesized FH on non-BCCH carriers, the number
of hopping frequencies (MA list) can be equal to or higher than the number of base station 
transceivers (TRXs). The term fractional loading is often used for MA lists larger than the 
number of TRXs. In FH GSM, two parameters can thus be used to control the interference 
level: the cluster size, K, and the fractional loading, L. Both network simulations and field 
trial results have verified that a GSM network can operate with a frequency reuse of 1/3 with 
a fractional loading of approx. 25-30%. 

By deploying adaptive antenna base stations in GSM, our goal was to increase
the load of a 1/3 reuse network up to the hard-blocking level of 80-90% load, thus achieving
a capacity increase of about a factor of 3 by adaptive antennas (see footnote 1). However, this potential capacity
increase is conditional on achieving a sufficient spatial filtering (beamforming) gain.
By assuming a linear relation between network load and the increase in interference level,
the minimum required spatial filtering gain can be estimated to approx. 5 dB. Applying a
$10 \log_{10}(M)$ relationship for the spatial filtering gain for a point source, an array size of
M = 4 should be sufficient. However, when considering azimuth multipath spread, it was
initially analyzed in [18] that 6-8 elements could be necessary in urban environments with a 
median azimuth spread of 10° or higher. In the following, we mainly limit our study to the 
case of M = 8. 
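The 5 dB estimate can be reproduced with a few lines of Python. This is only a sketch under the linear load-to-interference assumption stated above; the midpoint loads of 27.5% and 85% are chosen here for illustration (taken from the quoted ranges of 25-30% and 80-90%), and the function names are not from the original analysis.

```python
import math

def db(ratio):
    """Convert a power ratio to decibels."""
    return 10.0 * math.log10(ratio)

def required_gain_db(load_before, load_after):
    """Spatial filtering gain needed to offset the extra interference,
    assuming interference power grows linearly with network load."""
    return db(load_after / load_before)

def point_source_gain_db(m):
    """Ideal 10*log10(M) gain of an M-element array for a point source."""
    return db(m)

needed = required_gain_db(0.275, 0.85)   # about 4.9 dB
gain_m4 = point_source_gain_db(4)        # about 6.0 dB
gain_m8 = point_source_gain_db(8)        # about 9.0 dB
```

Under these assumptions, M = 4 leaves roughly 1 dB of margin for a point source, which is consistent with the remark that more elements may be needed once azimuth spread reduces the achievable gain.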



1 In [18], a more detailed analysis of the potential capacity gain, including the aspect of BCCH carrier, trunking 
efficiency, etc., can be found.

3. Space-Time Channel Sounding

3.1. Description of the Testbed 

The TSUNAMI II Stand-alone testbed was built with the purpose of performing time-space 
channel sounding and verifying beamforming algorithms [14]. The testbed consists of a trans- 
mitter (mobile unit) and an adaptive array receiver (base station unit). The base station unit has 
9 parallel receiver chains. Eight of the receivers are connected to an 8-element antenna array, 
while the ninth receiver is connected to a sector reference antenna. The carrier frequency is 
within the GSM-1800 frequency band. For the results presented here, an 8-element uniform
linear array antenna was used. The antenna elements are vertically polarized dipoles mounted
in front of a reflecting ground plane; the horizontal element spacing is half a wavelength. The
bandwidth was 200 kHz according to the GSM specifications [31] in the narrowband channel
sounding mode and 4 MHz according to UMTS in the wideband sounding mode [32].

3.2. Description of the Narrowband Measurements, Aarhus 

The measurement data from Aarhus, Denmark, was obtained using the 200 kHz bandwidth. 
The city of Aarhus is characterized by an irregular street layout and 3-5-story buildings. The 
adaptive base station antenna array was installed at a hotel on three different floors, corres- 
ponding to the heights of 20 m, 26 m, and 32 m. The lowest position almost corresponded to 
the rooftop level of the surrounding buildings. Several routes were measured,
and each route was repeated for the three antenna heights. The simulation results obtained 
from Aarhus are for a route ranging in distance from 1.5 km to 2 km from the base station and 
within an azimuth region of +20° to -30° relative to the broadside of the antenna array. We 
characterize the measurement area as Typical Urban, TU. 

3.3. Description of the Wideband Measurements, Stockholm 

The measurements from Stockholm, Sweden, were obtained using the upgraded wideband 
mode of the testbed. The simulation results presented are from a route ranging in distance 
from 0.5 km to 1 km from the base station, and an azimuth region ranging from -20° to 0°.
The base station antenna height was 21 m and the average building height was only a few
meters lower. The measurement route for the analysis was close to a river crossing the city of
Stockholm, which gave rise to a two-cluster multipath scenario. The propagation scenario is
therefore characterized as Bad Urban, BU. The wideband data from Stockholm was bandpass- 
filtered in order to comply with the GSM bandwidth. The filtering process makes it possible 
to extract several narrowband GSM frequency channels and thereby emulate FH. 

3.4. Characterization of Azimuth Dispersion 

It was reported in [15] and verified in [16] that the Power Azimuth Spectrum (PAS) for a
macro-cell base station can be well described by a Laplacian function for Typical Urban (TU)
and Rural Area (RA):

$$P(\phi) \propto \exp\left(-\sqrt{2}\,\frac{|\phi|}{\sigma}\right), \qquad (1)$$

where $\phi$ is azimuth and $\sigma$ is the azimuth spread (AS). For the Bad Urban (BU) case of
Stockholm, the PAS could be well modelled by two clusters with different mean azimuths,
where each cluster is described by a Laplacian function [27].

Figure 1. Distribution of azimuth spread for the base station antenna heights of 32 m (high) and 20 m (low) in
Aarhus and 21 m in Stockholm [27].

Figure 1 shows the cumulative frequency of measured AS values (averaged over 100 
wavelengths). The median AS is approx. 5° for the high antenna location in Aarhus and 10-12° 
for the low antenna locations of Aarhus and Stockholm. 

In Section 6, we will discuss how the AS impacts the spatial filtering gain of conventional 
beamforming. 
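As a numerical illustration of the Laplacian PAS, the following Python sketch evaluates a Laplacian of the form exp(-sqrt(2)|phi|/sigma) and confirms that its root-mean-square width equals sigma. The functional form with the sqrt(2) factor is an assumption of this sketch (it is the parameterization for which sigma is the standard deviation), and the function names and integration step are illustrative.

```python
import math

def laplacian_pas(phi_deg, sigma_deg):
    """Unnormalized Laplacian power azimuth spectrum (assumed form)."""
    return math.exp(-math.sqrt(2.0) * abs(phi_deg) / sigma_deg)

def rms_spread(sigma_deg, step=0.01, limit=180.0):
    """Root-mean-square azimuth width of the PAS by direct summation
    over [-limit, +limit] degrees."""
    n = int(round(2.0 * limit / step))
    total = 0.0
    second_moment = 0.0
    for i in range(n + 1):
        phi = -limit + i * step
        p = laplacian_pas(phi, sigma_deg)
        total += p
        second_moment += phi * phi * p
    return math.sqrt(second_moment / total)

# Median azimuth spreads reported for the high and low antenna positions.
spread_high = rms_spread(5.0)
spread_low = rms_spread(10.0)
```

With this parameterization, the median values of 5° and 10-12° quoted above can be read directly as the standard deviation of the angular power distribution seen at the base station.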

4. Network Simulation Model and Signal Description 

4.1. Network Simulation Model 

A simplified GSM network simulation model was developed in order to test and optimize 
various uplink and downlink beamforming algorithms under realistic network interference 
conditions [18]. The network model includes ten co-channel cells with a topology according 
to a 1/3 reuse network (tri-sectored base station configuration), see Figure 2. Cell #0 is the 
desired cell for the performance analysis and cells #1-9 are co-channel interfering cells. The 
path-loss model is $k \cdot r^{-3.5}$, where k is a constant and r the range. Log-normal fading with a
standard deviation of 8 dB was applied. The network simulator can import both simulated and 
measured multipath channel data for the desired mobile in cell #0. Only simulated multipath 
data were used for the co-channel interfering signals from cells #1-9. A data file containing 
the time-space impulse responses along the trajectories sketched in Figure 2 was created for 
each cell. For the generation of time-space multipath data, the scatterer model described in 
[18] was used.

Figure 2. Cellular geometry of the co-channel model used for simulating a 1/3 reuse GSM network. The desired
user follows a path in cell #0, while the interfering users in cells #1-9 move on the shown arcs.
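The large-scale part of the network model in Section 4.1 (a range exponent of 3.5 combined with log-normal fading of 8 dB standard deviation) can be sketched in Python as follows. The constant k is set to 0 dB and the function names are hypothetical; the actual simulator additionally uses time-space multipath data, which is not modelled here.

```python
import math
import random

def path_gain_db(r, k_db=0.0, exponent=3.5):
    """Deterministic path gain k * r^(-3.5), expressed in dB."""
    return k_db - 10.0 * exponent * math.log10(r)

def shadowed_gain_db(r, rng, sigma_db=8.0, k_db=0.0):
    """Path gain with log-normal fading, i.e. Gaussian in the dB domain."""
    return path_gain_db(r, k_db) + rng.gauss(0.0, sigma_db)

rng = random.Random(0)            # fixed seed so the sketch is repeatable
doubling_loss = path_gain_db(1.0) - path_gain_db(2.0)   # about 10.5 dB
sample = shadowed_gain_db(1000.0, rng)
```

Doubling the range costs about 10.5 dB with this exponent, so the 8 dB log-normal spread is comparable to a change in range of almost a factor of two.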

4.2. GSM System Aspects 

The network simulator can be configured for either synchronous or asynchronous mode. In
the synchronous mode, all cells are fully time-synchronized and the differences in propagation
delay are omitted. In the asynchronous mode, a uniform random timing offset is applied to 
each co-channel cell. The interfering cells employ other training sequences than the desired 
user. FH is also a feature in the network simulator. For the narrowband measurements, decor- 
relation in fading between successive bursts (frequency diversity) is emulated by performing 
short spatial steps around the true mobile location for each burst. For the wideband mode, the 
FH is emulated according to the description in Section 3.3. Uplink DTX is simulated, whereas 
uplink Power Control (PC) was disregarded in the analysis. 

4.3. Signal Description 

The simulation model is based on the multidimensional impulse responses between every 
mobile in the network and the base station antenna array of the desired cell. The base station 
antenna topology is a uniform linear array; 8 elements with half a wavelength element separa- 
tion. The co-channel interfering users move forward and backward on the arc segments shown 
in Figure 2. The desired user moves on a path in cell #0, for example as shown in Figure 2. 
The real measurement route determines the actual path. The received spatial data vector for 
symbol q in burst b is given by 

The index b is omitted in Equation (2) and in the sequel for clarity. It is important to note that 
the signal sample vector $\mathbf{x}_q$ includes signal contributions from both the desired mobile and all
9 interfering mobiles. 

5. Design of Beamforming Algorithm 







This section describes the designed beamforming algorithm, which has been derived for the 
downlink case. We do, however, also suggest using the algorithm for uplink reception.

For downlink transmission, one would ideally like to know the instantaneous spatio-temporal
impulse response matrix $\mathbf{H}^{TX}$ for every burst number b. This matrix contains the
impulse response from the mobile antenna to each element in the base station antenna array.
The dimension is M x N, where N is the length of the temporal impulse response.

$$\mathbf{H}^{TX} = [\mathbf{h}_1^{TX}\; \mathbf{h}_2^{TX}\; \ldots\; \mathbf{h}_q^{TX}\; \ldots\; \mathbf{h}_N^{TX}]. \qquad (3)$$

The vector $\mathbf{h}_q^{TX}$ is the transmit spatial impulse response vector for the excess delay index q in
burst b. The excess delay index q in the sequel is also referred to as the delay tap number.
The dimension of $\mathbf{h}_q^{TX}$ is M x 1. Unfortunately, with respect to obtaining $\mathbf{H}^{TX}$, GSM uses frequency
division duplexing and therefore the instantaneous value of $\mathbf{H}^{TX}$ cannot be found from
the corresponding uplink estimate $\mathbf{H}^{RX}$. Nevertheless, under the assumption of uncorrelated
scattering, the expectation value $E\{|\mathbf{H}^{TX}|\}$ is identical to $E\{|\mathbf{H}^{RX}|\}$.

Since we are using a random FH-GSM network, it is difficult for the base station beamformer
to obtain directional information about co-channel mobile stations. This is because
the downlink TDMA frame structure is 3 burst periods ahead of the corresponding uplink
TDMA frame structure [31]. Downlink null steering techniques for the suppression of emitted
interference to co-channel mobiles in adjacent cells [21] have therefore not been investigated.
With the lack of positioning information for co-channel mobiles, it is desirable to use the 
transmit weight vector w maximizing the energy received at the desired mobile station over 
the total transmitted energy. The following performance criterion was used for computing 
the transmit vector w (index 0 corresponds to the desired mobile in cell #0 of the simulation 
model):

$$\mathbf{w} = \arg\max_{\mathbf{w}} \left\{ \frac{\mathbf{w}^H \mathbf{R}^{TX,0} \mathbf{w}}{\mathbf{w}^H \mathbf{I} \mathbf{w}} \right\}, \qquad (4)$$

where $\mathbf{R}^{TX,0} = \sum_q E\{\mathbf{h}_q^{TX,0} (\mathbf{h}_q^{TX,0})^H\}$ is the sum of multipath covariances of the desired mobile
and $\mathbf{I}$ the identity matrix. Here, the superscript H denotes the conjugate transpose. In [18], it was
argued that for most radio channel conditions, less than 1 dB is lost by restricting the search
of the transmit vector to the form

$$\mathbf{w} = \mathrm{const} \times \mathbf{a}(\theta_0, f^{TX}), \qquad (5)$$

which implies that the weighting vector is simply a scaling of a steering vector $\mathbf{a}(\theta, f)$ with
the azimuth direction $\theta_0$, and that $f^{TX}$ is the downlink frequency. The steering vector $\mathbf{a}(\theta, f)$
is given by

$$\mathbf{a}(\theta, f) = [1, \exp(-j 2\pi f \Delta \sin(\theta)/c), \ldots, \exp(-j 2\pi f (M-1) \Delta \sin(\theta)/c)]^T, \qquad (6)$$

where $\Delta$ is the antenna element spacing and c the speed of light. The constraint on $\mathbf{w}$ given in
Equation (5) reduces the degrees of freedom from 2(M - 1) to only one: $\theta$. The weight vector
of the form given in Equation (5) is also well known as conventional beamforming [22]. As no
downlink entities can be estimated at the base station to solve Equation (4), we simply replace
TX by RX. This approximation will not lead to any substantial performance degradation if the
beam patterns for up- and downlink are similar. The best transmit direction is given by:

$$\theta_0 = \arg\max_{\theta} E\left\{ \sum_q \left| \mathbf{a}(\theta, f^{RX})^H \mathbf{h}_q^{RX,0} \right|^2 \right\}. \qquad (7)$$

Equation (7) can also be interpreted as a low resolution Direction of Arrival (DoA) estimator.
The more practical implementation of this DoA algorithm for GSM is described in the
following section.
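The direction search of Equation (7) can be sketched in plain Python for a single snapshot. The sketch assumes a uniform linear array with half-wavelength spacing, a hypothetical carrier of 1.8 GHz, and a one-degree search grid; the function names are illustrative, and the expectation over bursts is left out.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s

def steering_vector(theta_deg, f_hz, m=8, spacing_m=None):
    """Steering vector of an M-element uniform linear array.
    Half-wavelength spacing is assumed when no spacing is given."""
    if spacing_m is None:
        spacing_m = C / f_hz / 2.0
    s = math.sin(math.radians(theta_deg))
    return [cmath.exp(-2j * math.pi * f_hz * i * spacing_m * s / C)
            for i in range(m)]

def beam_power(theta_deg, taps, f_hz):
    """Sum over delay taps q of |a(theta, f)^H h_q|^2."""
    a = steering_vector(theta_deg, f_hz, m=len(taps[0]))
    return sum(abs(sum(ai.conjugate() * hi for ai, hi in zip(a, h))) ** 2
               for h in taps)

def best_direction(taps, f_hz, grid):
    """Grid direction that maximizes the beamformed power."""
    return max(grid, key=lambda theta: beam_power(theta, taps, f_hz))

f_rx = 1.8e9                             # hypothetical uplink carrier
taps = [steering_vector(20.0, f_rx)]     # one tap: plane wave from 20 degrees
theta_0 = best_direction(taps, f_rx, range(-60, 61))
peak = beam_power(theta_0, taps, f_rx)   # M^2 = 64 for a unit plane wave
```

For a single plane wave the search returns its true azimuth; with azimuth spread, the received power is distributed over several grid directions and the peak is lower than the point-source value.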

5.1. DoA Estimation Algorithm for GSM 

This section describes the practical implementation of the DoA algorithm for GSM including
the impulse response estimate $\mathbf{h}^{RX,0}$ and the expectation E{}. The GSM traffic burst contains a
midamble of 26 known symbols, the training sequence. The impulse response estimate $\mathbf{h}^{RX,0}$
is obtained by a cross-correlation between the received signal and a Minimum Shift Keying
(MSK) mapped version of the known training sequence I*** 0 (16 or 26 bits may be used). 2 

s+25 1 

where $\mathbf{x}_q$ is the spatial signal vector given in Equation (2) and * the complex conjugation. For
each antenna element, the estimated impulse response is N = 11 symbols long (±5 symbols
time lag) and the dimension of $\mathbf{H}^{RX,0}$ becomes M x 11. A crucial issue for the impulse response
estimation is the poor cross-correlation properties of the GSM training sequences. This is especially
an issue for a synchronized network. For equalization in GSM, it is typically assumed
that the Power Delay Profile is within 5 symbols. However, in order to reduce the impact of co-channel
interference on the DoA estimate (due to poor cross-correlation properties), it is here
further dictated that most of the energy in the impulse response received from the direction $\theta_0$
is kept within a window of only 3 symbols, $s \in \{k_0, k_0 + 1, k_0 + 2\}$, where $k_0 = 1, \ldots, 9$. This
constraint is based on the fact that the pulse-shaping filter for the linear MSK/OQAM model
of GMSK has an effective duration of approx. 2 symbol periods and the assumption that most
energy received from a certain azimuth direction is within 1 symbol duration (3.7 μs). This
assumption is valid for both the TU scenario of Aarhus and the BU two-cluster scenario of
Stockholm. Here, the energy received from the second cluster with a long excess delay has a
different azimuth direction than the energy received from the 1st cluster with low excess delay.
The expectation E{} in Equation (7) is implemented by averaging over B bursts. From analysis
of the burst pattern in DTX mode of GSM, it was suggested in [18] to use B = 21. This
corresponds to an averaging window of approx. 100 ms in the non-DTX mode and approx.
520 ms (slightly more than a SACCH multiframe [31]) in the DTX mode. It should be noted
that for slow-moving or stationary mobile stations, the time averaging over B bursts does not
ensure any averaging over fast fading unless FH is used to obtain frequency diversity. Hence,
FH is essential for the performance of downlink beamforming in a GSM network with a
mixture of fast- and slow-moving mobile stations.
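The time figures quoted in this section can be checked with a short calculation. The sketch below assumes one burst per TDMA frame for the desired user in the non-DTX mode (ignoring SACCH and idle frames) and uses the standard GSM frame duration of 120/26 ms and symbol period of 48/13 μs; the variable names are illustrative.

```python
TDMA_FRAME_MS = 120.0 / 26.0   # GSM TDMA frame duration, about 4.615 ms
SYMBOL_US = 48.0 / 13.0        # GSM symbol period, about 3.69 microseconds
B = 21                         # number of bursts averaged, as suggested in [18]

# Averaging window in the non-DTX mode, assuming one burst per TDMA frame.
window_non_dtx_ms = B * TDMA_FRAME_MS

# Duration of the 3-symbol window and of the 11-symbol impulse response estimate.
energy_window_us = 3 * SYMBOL_US
estimate_span_us = 11 * SYMBOL_US
```

The non-DTX window comes out at about 97 ms, consistent with the "approx. 100 ms" quoted above; the DTX figure depends on the DTX burst pattern and is not derived here.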

When using random FH, the uplink log-normal fading process in combination with uplink
DTX and PC of co-channel interfering mobiles will produce a very large variation in
interference level. Some bursts will be received as heavily interfered, whereas others will be
received almost free of interference. Because of the non-ideal cross-correlation properties of
the GSM training sequences, a "hit" by a strong co-channel interfering signal may sometimes
result in an erroneous estimation of $\theta_0$ even when averaging over B bursts [18]. This has been
the motivation for applying logarithmic power averaging instead of linear, as the former has
been found more resistant to this sort of interference distribution, see Equation (9). For the
considered case of an 8-element ULA covering a sector of 120°, it was decided in [18] to use a grid
of beams with a crossover depth of 0.5 dB between adjacent beams in order to reduce the
computational load of the algorithm. This constraint resulted in 22 fixed beam directions:

$\Theta$ = [-72.7, -59.4, -49.8, -41.9, -35.0, -28.5, -22.6, -16.6, -11.0,
-5.5, 0.0, 5.5, 11.0, 16.6, 22.6, 28.5, 35.0, 41.9, 49.9, 59.4, 72.7, 88.2].

2 Using 26 bits gives better noise and interference suppression than 16 bits, but the autocorrelation properties worsen.

Figure 3. Selection of fixed beam direction $\theta_0$.

The time averaged received power P(k, 9),k € {1, . . . , 9} and 0 e 0 are now given by: 



while the azimuth direction 0 O > which is selected for the beamforming transmit vector 
a(#o, Z 7 *), is given by 



The selection process of Equation (10) is illustrated in Figure 3. There are 9 possible values 
of k (excess delay) and 22 fixed values of 9 (azimuth direction). 

The computational and memory requirements of the algorithm are rather low relative to 
the performance of modern DSPs. We do, therefore, believe that the algorithm is feasible for 
real-time implementation. 

5.2. Aspects of Analogue Beamforming 

In the derivation of the beamforming algorithm presented in the previous section, we assumed 
a digital beamforming implementation (which is the case for the TSUNAM II adaptive antenna 
base station). However, the concept of digital transmit beamforming may not be cost efficient 
for commercial implementation. Analogue beamforming, by e.g., using a Butler matrix, is an 
obvious alternative [19]. Fortunately, because of the constraints on the transmit vector w set 
in Equation (5), the algorithm is also applicable for an analogue beam-switching approach. In 
Section 6.3, we also include results for both an 8 x 8 and a 4 x 4 Butler matrix implementation. 
The steering directions for the 8 x 8 Butler matrix are: 




(9) 



#0 — ar g0<E0,*e[l 9] 



maxP(0,A;). 



(10) 



0 = [-60.7, -38.7, -22.5, -7.2, 7.2, 22.5, 38.7, 60.7] . 



(11) 



For a 4 x 4 Butler matrix, the following fixed beam directions are: 



0 = [-48.6, -14.5, 14.5, 48.6]. 



(12) 
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The crossover depth between adjacent beams for both the 4 x 4 and 8 x 8 Butler matrix 
solutions is approx. -4 dB, which may significantly degrade the average performance of the 
adaptive antenna. For the 8 x 8 implementation, the crossover depth is narrow compared to 
the azimuth spread measured in urban environments and the depth will partly be filled by mul- 
tipath signals. This is different for a 4 x 4 Butler matrix implementation because of the wider 
depths and we therefore suggest the use of two Butler matrix implementations with a half 
beamwidth offset in beam directions, resulting in 8 beam directions. Such an implementation 
can be found in [20] for the case of two 6x6 Butler matrixes using orthogonal polarization. 
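
Since the selection rule of Equation (10) only needs a set of fixed beam directions, it can be sketched in the same way for the 22-beam digital grid and for a Butler matrix beam set. The Python sketch below is illustrative only. The array layout `power[b, k, f]`, the function name `select_beam`, and the use of a plain mean of dB values as the logarithmic average are assumptions, not the exact form of Equation (9).

```python
import numpy as np

# Steering directions of the 8 x 8 Butler matrix, Equation (11), in degrees.
BUTLER_8X8_DEG = np.array([-60.7, -38.7, -22.5, -7.2, 7.2, 22.5, 38.7, 60.7])


def select_beam(power, directions_deg):
    """Pick the beam direction with the largest log-averaged power.

    power: strictly positive linear powers, shape (B, K, F) for
           B bursts, K excess-delay taps and F fixed beams (assumed layout).
    Returns (direction in degrees, index of the selected delay tap).
    """
    # Logarithmic averaging over bursts: mean of the dB values (assumed form).
    log_avg_db = np.mean(10.0 * np.log10(power), axis=0)  # shape (K, F)
    k, f = np.unravel_index(np.argmax(log_avg_db), log_avg_db.shape)
    return float(directions_deg[f]), int(k)
```

With a single very strong interference hit in one burst, the mean of dB values stays close to the typical level of that beam, whereas a linear mean is dominated by the hit. This matches the motivation given for logarithmic averaging.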

6. Uplink Simulation Results 

Although the beamforming algorithm described in the previous section was derived for 
downlink, we will first look into its uplink performance. This is partly because the DoA estimation is 
an uplink issue and partly because the layout of the simulated network is better suited for 
investigating uplink performance. Downlink performance results will be given in Section 7. 

6.1. DoA Characteristics 

The designed DoA algorithm should preferably estimate the azimuth direction of the strongest 
signal path from the desired user (averaged over short-term fading). For low AS values, the 
mean DoA estimate is mostly identical to the geometrical azimuth of the mobile station with 
some variation. However, in environments with large azimuth spread (such as the Bad Urban, 
two-cluster scenario from Stockholm), the DoA estimate can be significantly biased from 
the geometrical azimuth of the mobile station. Figure 4 gives an example comparing the 
geometrical and estimated DoA as a function of time. The measurement route is from Aarhus, 
repeated for the 3 different base heights: 32 m, 26 m, and 20 m (note the vertical shift of 
the curves). It can be observed that the mean of the DoA estimate follows the geometrical 
azimuth for all three heights, but the variation in DoA estimates increases as the antenna 
height is lowered (corresponding to an increase in AS). 

6.2. Beamforming Gain Dependence on Single Element C/I 

The DoA algorithm was designed with the goal of being robust to interference. If simply 
applying 10 log10(M) for the maximum beamforming gain, where M = 8, and assuming 
a required C/I of 9 dB at the input of the detector [31], then the worst case single element 
C/I should at least be 0 dB. This worst case C/I of 0 dB is an average over multipath fading, 
and hence approximately half the bursts will be received with a negative instantaneous single 
element C/I value. Figures 5(a,b) show the DoA estimates for a small part of the Stockholm 
route for the cases of -2 dB and -12 dB C/I, respectively. 
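
The arithmetic behind the 0 dB figure can be checked with a few lines of Python. This is a minimal sketch; the function names are illustrative and not taken from the paper.

```python
import math


def max_beamforming_gain_db(num_elements):
    """Ideal gain of conventional beamforming for a point source, 10 log10(M)."""
    return 10.0 * math.log10(num_elements)


def min_single_element_ci_db(required_ci_db, num_elements):
    """Lowest single element C/I that still meets the detector requirement."""
    return required_ci_db - max_beamforming_gain_db(num_elements)


gain = max_beamforming_gain_db(8)               # about 9 dB for M = 8
worst_case = min_single_element_ci_db(9.0, 8)   # roughly 0 dB
```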

Figure 5 clearly shows an enlarged fluctuation in the DoA estimate when the C/I is lowered 
from -2 dB to -12 dB. For very low C/I, the DoA algorithm sometimes tracks strong 
co-channel interfering mobiles instead of the desired mobile. In Figure 6, the beamforming gain 
(over single element) is shown as a function of C/I (for the same part of the route as shown 
in Figure 5). It can be observed that the gain significantly decreases when the C/I becomes 
lower than approx. -8 dB. This is, however, not fatal, as the beamformer will operate at single 
element C/I of 0 dB or more as discussed above. The length of the GSM training sequence 
(i.e. cross-correlation properties) and the length of the burst averaging B mainly determine the 
"breakdown" point of the algorithm. 

Figure 4. Characteristics of the DoA for the Aarhus measurement route at 3 different base station antenna heights. 

Figure 5. DoA estimate for a small part of the Stockholm measurement route: (a) C/I = -2 dB; (b) C/I = -12 dB. 



6.3. C/I Gain Distribution 



Figure 7 shows the simulated C/I improvement for the Aarhus measurement route for the 
antenna height of 32 m. As mentioned, the azimuth sweep of the measurement route ranges 
from +20° to -30° and the distance to the base station varies between 1.5 km and 2 km. For 
clarity, the C/I improvement is shown as an average over 40 bursts. 

Figure 7. (a) C/I improvement (over single element) versus time for the antenna height of 32 m, Aarhus. (b) Single 
element C/I. 

It can be observed from Figure 7 that the local C/I improvement varies significantly along 
the measurement route, i.e., the improvement is position-dependent. It can also be observed 
that the C/I improvement is almost uncorrelated with the single element C/I, which is usually 
better than -5 dB (note the vertical shift of the curve by -10 dB in the plot). Figure 8 shows the 
cumulative frequency of C/I values, averaged over 8 bursts, for the Aarhus measurement route 
for both single element and for beamforming (M = 8). The 8-burst averaging period 
corresponds to the interleaving period of the GSM full rate speech channel. We chose to use the 
10% outage level to identify the beamforming gain. Hence, from Figure 8, the beamforming 
gain is 8.3 dB. 
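
Reading the gain at a given outage level off two cumulative distributions amounts to comparing percentiles. The sketch below assumes that the C/I samples are already in dB and already averaged over 8 bursts. The function name and the synthetic data are illustrative and are not measurement results.

```python
import numpy as np


def outage_gain_db(ci_single_db, ci_beam_db, outage=0.10):
    """Difference between the C/I levels exceeded for (1 - outage) of the samples."""
    single_level = np.percentile(ci_single_db, 100.0 * outage)
    beam_level = np.percentile(ci_beam_db, 100.0 * outage)
    return float(beam_level - single_level)


# Synthetic illustration: a beamforming curve shifted by a constant 8.3 dB.
single = np.linspace(-10.0, 20.0, 301)
beam = single + 8.3
gain = outage_gain_db(single, beam)
```

For measured data the two curves are generally not parallel, so the gain depends on the chosen outage level.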

Table 1 summarizes the C/I gain of different antenna configurations for an outage level of 
10%. The results are shown for both measured channel data from Aarhus and for simulated 
radio channel data (having the same median AS). The results from Aarhus show that a 4-element 
antenna array is sufficient to obtain a beamforming gain on the order of 5 dB, provided the 
number of fixed steering directions is 8 (or more), hence facilitating a full load in a 1/3 reuse 
FH-GSM network (see Section 2). For the case of AS = 5°, there is a very good agreement 
between the measured and simulated radio channel data, whereas for the case of AS = 10° the 
discrepancy is on the order of 1-1.5 dB. It should be noted that the simulated TU channel data 
assume a Gaussian PAS [18] instead of a Laplacian one, as later found in [15]. The Gaussian 
model seems to predict too pessimistic results for conventional beamforming. 

Figure 8. Cumulative frequency of the single element C/I and beamforming C/I (M = 8) for the Aarhus 
measurement route and 32 m base station height. 

Table 1. The C/I gain at the 10% outage level for different antenna array 
configurations: AS = median azimuth spread; F = number of fixed beam 
directions. The measured results are from Aarhus. The simulated results 
were generated using the TU model in [18]. 

Figure 9 shows results similar to Figure 8, but for the severe BU Stockholm route. For a 
part of this route, two separate scattering clusters located at -30° and +25° in azimuth with 
almost equal mean power were observed [27]. At the 10% outage level, the beamforming 
gain is only 5.3 dB, which is a significant reduction compared to the results for the Aarhus 
route. Furthermore, the curve for beamforming starts to approach the single element curve 
for a cumulative frequency of less than 5%. These results from Stockholm indicate that it 
can become difficult to obtain large gain values of conventional beamforming in some severe 
urban propagation environments. 

Figure 9. Cumulative frequency of the single element C/I and the beamforming C/I (M = 8) for the Stockholm 
measurement route. 

6.4. Analytical Results for Conventional Beamforming 

When the AS becomes large relative to the antenna beamwidth of conventional beamforming, 
not all of the impinging waves will be captured and the Array Factor (AF) becomes saturated. 
In Figure 10, the relative loss in captured signal power is shown as a function of the 3 dB 
beamwidths for different AS. The curves are computed analytically, assuming a Laplacian 
PAS. The ideal AF for a single wave is also shown in the figure (dashed line). 

For example: for M = 8, the ideal AF is 9 dB and the 3 dB beamwidth is approximately 
13°. For an AS of 10°, the captured power is reduced by 2.2 dB. Hence, the effective AF is 
reduced to (9-2.2) dB = 6.8 dB. Similarly for M = 4 and AS = 10°, the effective AF is 
(6-0.8) dB = 5.2 dB. The additional gain from doubling the number of antenna elements from 4 
to 8 is only 1.6 dB. In urban areas, therefore, we suggest limiting the number of antenna 
elements to 6-8 for cost-efficiency reasons. Inspection of Figure 10 for M = 8 and M = 4 
gives the following effective AF values at 5° and 10° azimuth spread: 

• M = 8, AS = 5°: (9-0.8) dB = 8.2 dB 

• M = 8, AS = 10°: (9-2.2) dB = 6.8 dB 

• M = 4, AS = 5°: (6-0.2) dB = 5.8 dB 

• M = 4, AS = 10°: (6-0.8) dB = 5.2 dB 

By comparing these gain values, obtained by simple analytical calculations, to the values 
obtained by extensive network simulations (see Table 1), a very good agreement can be 
found. Hence, when the median AS of a certain environment is known, the performance of 
adaptive base station antennas using conventional beamforming can be accurately predicted 
using Figure 10, thereby avoiding heavy computer simulations and time-consuming test 
measurements. In [28, Figure 6], it was suggested to predict the AS from cross-correlation 
statistics of (existing) 2-branch space diversity configurations. 
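
The prediction procedure described above reduces to subtracting a capture loss from the ideal AF. The following sketch uses only the loss values quoted in the text for Figure 10. The lookup table, the function name, and the rounding of the ideal AF to whole dB (9 dB and 6 dB, as in the text) are assumptions made for illustration.

```python
import math

# Loss in captured power in dB as quoted in the text from Figure 10,
# keyed by (number of elements M, azimuth spread in degrees).
CAPTURE_LOSS_DB = {(8, 5): 0.8, (8, 10): 2.2, (4, 5): 0.2, (4, 10): 0.8}


def effective_af_db(num_elements, as_deg):
    """Ideal array factor minus the loss caused by azimuth spread."""
    # The text rounds 10 log10(M) to whole dB: 9 dB for M = 8, 6 dB for M = 4.
    ideal_af = round(10.0 * math.log10(num_elements))
    return ideal_af - CAPTURE_LOSS_DB[(num_elements, as_deg)]


# Extra gain from doubling the array from 4 to 8 elements at AS = 10 degrees.
extra_gain = effective_af_db(8, 10) - effective_af_db(4, 10)
```

Other values of M or AS would require reading further points from Figure 10. They are not included here.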



7. Downlink Simulation Results 



In this section, the performance improvement for downlink is analyzed. For the downlink 
performance analysis, only simulated channel data have been used. This was in order to ensure 
a uniform azimuth location of the desired user in cell #0. The following performance measure 
was used: 

G_{L,i} = \frac{C^{Array}_{L,L} \, I^{Single}_{L,i}}{I^{Array}_{L,i} \, C^{Single}_{L,L}}     (13) 

where C_{L,L} is the power delivered by the serving base station in cell L to the desired mobile 
in cell L and I_{L,i} is the interfering power delivered from the serving base station in cell L to a 
co-channel mobile in cell i. In the simulator, L and i are restricted to 0 and 1, respectively. The 
superscripts "Single" and "Array" denote single element and antenna array cases, respectively. 
All values in Equation (13) have been averaged over fast fading (40 bursts). Calculation of the 
instantaneous powers is done as follows: 

C = |w^H H_{L,L}|^2     (14) 
I = |w^H H_{L,i}|^2,    (15) 

where H_{L,L} is the impulse response matrix between the serving base station and the desired 
mobile in cell L and H_{L,i} is the impulse response matrix for the serving base station in cell 
L to the co-channel mobile in cell i. w is given by Equation (5). The downlink interference 
improvement for the interfered mobile in cell #1 (i = 1) was analyzed. The azimuth of the 
desired mobile in cell #0 and the interfered mobile "II" in cell #1 are shown in Figure 11. 
The desired mobile performs an azimuth sweep of -60° to 60°, while the interfered mobile 
moves between -20° and 20° relative to the base station in cell #0. It can be observed that 
the interfered mobile in cell #1 has a larger angular speed than the desired mobile in cell #0, 
which of course is unrealistic. The reasoning for this is to ensure better statistical output from 
a short Monte Carlo simulation, and it has no significant influence on the mean performance 
results. 

Performance of Adaptive Antennas in FH-GSM Using Conventional Beamforming 269 

Figure 11. Azimuth of the desired mobile in cell #0 and the interfered mobile "II" in cell #1 versus time (horizontal axis: Time [Sec]). 

Figure 12. C/I improvement according to Equation (13). The azimuth region in which "II" potentially gets 
interfered is shown by the vertical lines. 

From Figure 12, it can be observed that when the desired mobile in cell #0 is in the 
same azimuth region as the interfered mobile in cell #1, the performance improvement of 
conventional beamforming is almost an on/off function. This is expected, because when the 
azimuth separation between the interfered and desired mobile is less than half the beamwidth 
(AS = 0°), the beamformer provides no or only little interference suppression. For azimuth 
separations of more than half the beamwidth, the interference suppression is determined by 
the side lobe level. If a non-uniform window function is used, the side lobe level may be 
improved at the penalty of increased beamwidth. For a random FH network, our simulation 
results have shown that a uniform window function gives the best mean performance. 

Table 2. Downlink performance improvement (over single element) 
for different configurations of the base station antenna array (AS = 7.5°). 

Configuration    4 Antennas             8 Antennas 
                 4 Beams    8 Beams     8 Beams    22 Beams 
C/I gain         4.1 dB     5.3 dB      7.3 dB     7.8 dB 

The interference situation shown in Figure 12 is without FH. In a random FH network, the 
interferer situation will change from time slot to time slot and effectively perform an averaging 
of the interference improvement pattern. The average improvement over a single element is 
7.8 dB for the simulation of digital beamforming with 22 fixed beams and M = 8 (AS = 
7.5°). 

Improvements for the different beamforming and antenna array configurations are 
summarized in Table 2. The simulated downlink results shown in Table 2 are in good agreement 
with the uplink results shown in Table 1. 

7.1. Field Trial Results 

The proposed beamforming algorithm has been implemented and tested live in the TSUNAMI 
II GSM-1800 adaptive antenna base station demonstrator at Orange PCS Ltd in Bristol, U.K. 
An extensive description of the field trial and the results can be found in [24] and [25]. It was 
concluded in [24] that the described beamforming algorithm (referred to in [24] as "(TRB) 
Grid of Beams"), among other algorithms, showed the best overall link performance in terms 
of quality. In the micro-cellular field trial, the mean downlink improvement was estimated at 
5.7 dB (M = 8). This result is in the same range as our results from Stockholm. Considering 
the relatively large AS expected for a micro-cellular base station installation, this is a satisfactory 
result for the algorithm. 

8. Conclusion 

A robust DoA algorithm for downlink conventional beamforming in random FH-GSM has 
been presented. The algorithm was thoroughly verified by means of network computer 
simulations using both simulated and measured radio channel data. It was also tested in a GSM-1800 
field trial in the U.K. with good results. The Azimuth Spread (AS) was found to be an 
important parameter for the performance of conventional beamforming. The simple 10 log10(M) 
expression for the spatial filtering gain of a point source was too optimistic for urban areas. 
Space-time measurements were conducted in the two dissimilar cities of Aarhus and 
Stockholm, where the median azimuth spread was in the range of 10°-12° for low macro-cellular 
base station antenna heights (approx. 20 m). The gain for an 8-element array was reduced from 
the theoretical value of 9 dB for AS = 0° to 7.4 dB and 5.3 dB for the Aarhus and Stockholm 
measurement results, respectively. We conclude that 4 to 6 antenna elements provide a spatial 
filtering gain on the order of 5 dB in most urban macro-cellular environments, thus allowing a 
1/3 reuse FH-GSM network to be fully loaded and thereby achieve a capacity increase by a factor of 3. 

Abstract 

In this paper we consider the problem of data collection 
from a sensor web consisting of N nodes, where nodes 
have packets of data in each round of communication that 
need to be gathered and fused with other nodes' packets 
into one packet and transmitted to a distant base station. 
Nodes have power control in their wireless 
communications and can transmit directly to any node in 
the network or to the base station. With unit delay cost 
for each packet transmission, if all nodes transmit data 
directly to the base station, then both high energy and 
high delay per round will occur. In our prior work [6], 
we developed an algorithm to minimize the energy cost 
per round, where a linear chain of all the nodes is 
formed to gather data, and nodes take turns to transmit to 
the base station. If the goal is to minimize the delay cost, 
then a binary combining scheme can be used to 
accomplish this task in about log N units of delay with 
parallel communications and incurring a slight increase 
in energy cost. The goal is to find data gathering schemes 
that balance the energy and delay cost, as measured by 
energy*delay. We conducted extensive simulation 
experiments with a number of schemes for this problem 
with 100 nodes in playing fields of 50m x 50m and 100m x 
100m and the base station located at least 100 meters and 
200 meters, respectively, from any node. With CDMA 
capable sensor nodes, a chain-based binary scheme 
performs best in terms of energy*delay. If the sensor 
nodes are not CDMA capable, then parallel 
communications are possible only among spatially 
separated nodes, and a chain-based 3 level hierarchy 
scheme performs well. These schemes perform 60 to 100 
times better than the direct scheme and also outperform a 
cluster-based scheme called LEACH [3]. 



1. Introduction 

Inexpensive sensors are deployed for data collection 
from the field in a variety of scenarios including military 
surveillance, building security, in harsh physical 
environments, for scientific investigations on other 
planets, etc. [2,4,13]. A sensor node will have limited 
computing capability and memory, and it will operate 
with limited battery power. These sensor nodes can 
self-organize to form a network and can communicate with 
each other in a wireless manner. Each node has transmit 
power control and an omni-directional antenna, and 
therefore can adjust the area of coverage with its wireless 
transmission. For example, a sensor network can be used 
for detecting the presence of potential threats in a military 
conflict. Since wireless communications consume 
significant amounts of battery power, sensor nodes should 
be energy efficient in transmitting data [5,10,12]. Figure 1 
shows a 100-node fixed sensor network in a playing field 
of size 50m x 50m with the base station (BS) fixed and far 
away from all the sensor nodes. 

Figure 1. Random 100-node topology for a 50m x 50m 
network. BS is located at (25, 150), which is at least 100m 
from the nearest node. 

In this paper we assume the following: 

• Each sensor node has power control and the 
ability to transmit to any other node or directly 
to the BS [5,7]. 

• Our model sensor network contains 
homogeneous and energy constrained sensor 
nodes with uniform energy. 

• Every node has location information. 

• There is no mobility. 

An important operation in a sensor network is 
systematic gathering of data from the field, where each 
node has a packet of information in each round of 
communication [3]. In this operation, data sensed by the 
nodes need to be combined into a single message and sent 
to a distant base station. This data fusion among the 
sensor nodes requires wireless communications. The 
amount of energy spent in transmitting a packet has a 
fixed cost in electronics and a variable cost that depends 
on the distance of transmission. Receiving a data packet 
also has a similar fixed energy cost in electronics. 
Therefore, to conserve energy short distance 
transmissions are preferred. In order to balance the energy 
spent in the sensor nodes, nodes should take turns 
transmitting to the BS, as this is an expensive 
transmission. 

In each round of this data-gathering application, all data 
from all nodes need to be collected and transmitted to the 
BS, where the end-user can access the data. A simple 
approach to accomplish this task is for each node to 
transmit its data directly to the BS. Since the BS is located 
far away, the cost to transmit to the BS from any node is 
high, and therefore, the total energy cost per round will be 
high. In sensor networks, data fusion helps to reduce the 
amount of data transmitted between sensor nodes and the 



BS. Data fusion combines one or more data packets from 
different sensor measurements to produce a single packet 
as described in [3]. The LEACH protocol presented in [3] 
is an elegant solution to this data collection problem, 
where a small number of clusters are formed in a self- 
organized manner. A designated node, the cluster head, in 
each cluster collects and fuses data from nodes in its 
cluster and transmits the result to the BS. LEACH uses 
randomization to rotate the cluster heads and improves the 
energy cost per round by a factor of 4 compared to the 
direct approach for the 100-node network of Figure 1. 

Recently, we developed an improved protocol called 
PEGASIS (Power-Efficient GAthering in Sensor 
Information Systems), which requires less energy per 
round compared to LEACH [6]. The key idea in 
PEGASIS is to form a chain among the sensor nodes so 
that each node will receive from and transmit to a close 
neighbor. Gathered data move from node to node, get 
fused, and eventually a designated node transmits to the 
BS. Nodes take turns to transmit to the BS so that the 
average energy spent by each node per round is reduced. 
Building a chain to minimize the total length is similar to 
the traveling salesman problem, which is known to be 
intractable. However, with the radio communication 
energy parameters, a simple chain built with a greedy 
approach performs quite well. The PEGASIS protocol 
achieves up to 100% improvement with respect to energy 
cost per round compared to the LEACH protocol. In this 
paper we will not describe PEGASIS, but evaluate 
PEGASIS and two new protocols in terms of 
energy* delay for the data gathering application. 
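
Since the chain construction itself is not described here, the following Python sketch shows only one plausible reading of "greedy": a nearest-neighbor chain that repeatedly hops to the closest node not yet in the chain. The function names, the choice of start node, and the tie-breaking by node index are assumptions for illustration and should not be read as the PEGASIS procedure of [6].

```python
import math


def greedy_chain(nodes, start=0):
    """Order nodes into a chain by always hopping to the nearest unvisited node.

    nodes: list of (x, y) positions in metres.
    start: index of the first node of the chain (assumed choice).
    """
    remaining = set(range(len(nodes)))
    remaining.remove(start)
    chain = [start]
    while remaining:
        last = nodes[chain[-1]]
        # Ties are broken by the lower node index to keep the result deterministic.
        nxt = min(remaining, key=lambda i: (math.dist(last, nodes[i]), i))
        chain.append(nxt)
        remaining.remove(nxt)
    return chain


def chain_length(nodes, chain):
    """Total length of the chain, i.e. the sum of its hop distances."""
    return sum(math.dist(nodes[a], nodes[b]) for a, b in zip(chain, chain[1:]))
```

A greedy chain is not guaranteed to have minimum total length, which is consistent with the remark that the exact problem resembles the traveling salesman problem.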

Our schemes can be modified appropriately if some of 
the stated assumptions about sensor nodes are not valid. 
If nodes are not within transmission range of each other, 
then alternative, possibly multi-hop transmission paths 
will have to be used. In fact, our chain based schemes will 
not be affected that much as each node communicates 
only with a local neighbor and we can use a multi-hop 
path to transmit to the BS. We need to make some 
adjustments in the chain construction procedure to ensure 
that no node is left out. Other schemes, including 
LEACH, rely on direct reachability to function correctly. 
To ensure balanced energy dissipation in the network, an 
additional parameter could be considered to compensate 
for nodes that must do more work every round. If the 
sensor nodes have different initial energy levels, then we 
could consider the remaining energy level for each node 
in addition to the energy cost of the transmissions. The 
assumption of location information is not critical. The BS 
can determine the locations and transmit to all nodes, or 
the node can determine this through received signal 
strengths. For example, nodes could transmit 
progressively reduced signal strengths to find a close 
neighbor to exchange data. This would require the nodes 
to consume some energy when trying to find local 
neighbors; however, this is only a fixed initial energy cost 
when constructing the chain. If nodes are mobile, then 
different methods of transmission could be examined. For 
instance, if nodes could approximate how often and at 
what speed other nodes are moving, then they could 
determine more intelligently how much power is needed 
to reach the other nodes. Perhaps, the BS can help 
coordinate the activities of nodes in data transmissions. 
Discussion of schemes with mobile sensor nodes is 
beyond the scope of this paper. 

Another important factor to consider in the data 
gathering application is the average delay per round. 
Here, we assume that data gathering rounds are far apart, 
and the only traffic in the network is due to sensor data. 
Therefore, data transmissions in each round can be 
completely scheduled. The delay for a packet 
transmission (we assume that all packets are 2000 bits 
long) is dominated by the transmission time as there is no 
queuing delay and the processing and propagation delays 
are negligible compared to the transmission time. With 
the direct transmission scheme, nodes will have to 
transmit to the base station one at a time, making the 
delay a total of 100 units (1 unit per transmission). The 
linear chain-based scheme, although energy efficient, will 
also require 100 units of delay as the transmissions are 
sequential. To reduce delay, one needs to perform 
simultaneous transmissions. The well known approach of 
using a binary scheme to combine data from N nodes in 
parallel will take about log N units of delay, although 
incurring an increased energy cost. Energy*Delay is an 
interesting metric to optimize per round of data gathering. 

Simultaneous wireless communications among pairs of 
nodes are possible only if there is minimal interference 
among different transmissions. CDMA technology can be 
used to achieve multiple simultaneous wireless 
transmissions with low interference. If the sensor nodes 
are CDMA capable, then it is possible to use the binary 
scheme and perform parallel communications to reduce 
the overall delay. However, the energy cost may have to 
go up slightly as there will still be a small amount of 
interference from other unintended transmissions. 
Alternatively, with a single radio channel and non-CDMA 
nodes, simultaneous transmissions are possible only 
among spatially separated nodes. Since the energy costs 
and delay per transmission for these two types of nodes 
are quite different, we will consider energy*delay 
reduction for our data gathering problem separately for 
these two cases. 

In this paper we present two protocols for energy*delay 
reduction: a chain-based binary combining protocol that 
uses CDMA capable nodes and a 3 level hierarchy chain- 
based protocol for non-CDMA nodes. A chain is formed 
among the sensor nodes in both of these protocols so that 
each node will receive from and transmit to a close 
neighbor at the lowest level of the hierarchy. Gathered 



data move from node to node, get fused, and eventually a 
designated node transmits to the BS. Nodes take turns 
transmitting to the BS so that the average energy spent by 
each node per round is reduced. The binary scheme has a 
hierarchy of log N, with N equal to the number of nodes. 
The binary scheme would therefore have a delay of 7 + 1 
(for transmitting to the base station) for 100 nodes and 
performs better than LEACH by a factor of 8. The 3 level 
hierarchy chain-based protocol has a higher delay but is 
better than the binary scheme with non-CDMA nodes. 
This is because in the binary scheme there are many 
nearby simultaneous transmissions at the lower levels and 
the interference will be very high. In the 3 level scheme, 
fewer and distant simultaneous transmissions take place 
causing less interference. This 3 level chain-based 
protocol performs better than the direct scheme by a 
factor of about 60. 
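
The delay figures quoted above can be reproduced with a short calculation. This is a minimal sketch that counts one unit of delay per packet transmission. The function names are illustrative, and the binary scheme is assumed to need one level per halving of the number of active nodes plus one final transmission to the BS.

```python
def sequential_delay(num_nodes):
    """Direct and linear-chain schemes: transmissions happen one at a time."""
    return num_nodes


def binary_delay(num_nodes):
    """Binary combining: ceil(log2 N) parallel levels plus one hop to the BS."""
    levels = (num_nodes - 1).bit_length()  # integer ceil(log2 N) for N >= 1
    return levels + 1


delay_direct = sequential_delay(100)   # 100 units
delay_binary = binary_delay(100)       # 7 + 1 = 8 units
```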

The paper consists of the following sections. In Section 
2, the radio model for energy calculations is discussed. In 
Section 3, an analysis of the energy*delay metric for data 
gathering is given. Section 4 describes the chain-based 
binary approach using CDMA capable sensor nodes. 
Section 5 presents the chain-based 3 level scheme without 
CDMA capable sensor nodes. Experimental results are 
given in Section 6. Finally, Section 7 concludes the paper 
and proposes future work. 

2. Radio Model for Energy Calculations 

We use the same radio model as discussed in [3] which 
is the first order radio model. In this model, a radio 
dissipates E_elec = 50 nJ/bit to run the transmitter or 
receiver circuitry and ε_amp = 100 pJ/bit/m² for the 
transmitter amplifier. The radios have power control and 
can expend the minimum required energy to reach the 
intended recipients. The radios can be turned off to avoid 
receiving unintended transmissions. 

An r² energy loss is used due to channel transmission 
[8,11]. The equations used to calculate transmission costs 
and receiving costs for a k-bit message and a distance d 
are shown below: 

Transmitting 

E_{Tx}(k, d) = E_{Tx-elec}(k) + E_{Tx-amp}(k, d) 
E_{Tx}(k, d) = E_{elec} \cdot k + \epsilon_{amp} \cdot k \cdot d^2 

Receiving 

E_{Rx}(k) = E_{Rx-elec}(k) 
E_{Rx}(k) = E_{elec} \cdot k 

Receiving is also a high cost operation, therefore, the 
number of receives and transmissions should be minimal. 

In our simulations, we used a packet length k of 2000 
bits. With these radio parameters, when d² is 500 m², the 
energy spent in the amplifier part equals the energy spent 
in the electronics part, and therefore, the cost to transmit a 
packet will be twice the cost to receive. 
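
The first order radio model and the break-even distance mentioned above can be written out as a small Python sketch. The constants are the values stated in this section. The function and constant names are chosen here for illustration.

```python
import math

E_ELEC = 50e-9      # J/bit, transmitter or receiver electronics
EPS_AMP = 100e-12   # J/bit/m^2, transmitter amplifier


def tx_energy(k_bits, d_m):
    """Energy in joules to transmit a k-bit packet over a distance d (r^2 loss)."""
    return E_ELEC * k_bits + EPS_AMP * k_bits * d_m ** 2


def rx_energy(k_bits):
    """Energy in joules to receive a k-bit packet."""
    return E_ELEC * k_bits


# For d^2 = 500 m^2 the amplifier term equals the electronics term,
# so transmitting a 2000-bit packet costs twice as much as receiving it.
d_break_even = math.sqrt(500.0)
ratio = tx_energy(2000, d_break_even) / rx_energy(2000)
```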

It is assumed that the radio channel is symmetric so that 
the energy required to transmit a message from node i to 
node j is the same as energy required to transmit a 
message from node j to node i for a given signal to noise 
ratio (SNR), typically 10 dB. When there are multiple 
simultaneous transmissions, the transmitted energy should 
be increased to ensure that the same SNR as with a single 
transmission is maintained. With CDMA nodes using 64
or 128 chips per bit (which is typical), the interference
from other transmissions is calculated as a small fraction
of the energy from the other unintended transmissions. This
effectively increases the energy cost to maintain the same
SNR. With non-CDMA nodes, the interference will equal
the amount of energy seen at the receiver from all other
unintended transmitters. Therefore, only a few spatially
distant pairs can communicate simultaneously.

3. Energy*Delay Analysis for Data 
Gathering 

In this section we analyze the energy*delay cost
per round for data gathering from a sensor web to the
distant BS. Recall that the data collection problem of
interest is to send a $k$-bit packet from each sensor node in
each round. The goal is to keep the sensor web
operating as long as possible while minimizing delay. A
fixed amount of energy is spent in the electronics when
receiving and transmitting a packet, and an additional
amount proportional to $d^2$ is spent while
transmitting a packet. There is also a cost of 5
nJ/bit/message for data fusion. The delay cost can be
calculated in units of time. On a 2 Mbps link, a 2000-bit
message can be transmitted in 1 ms. Therefore each unit of
delay corresponds to about 1 ms for the case of a
single channel and non-CDMA sensor nodes. The actual
delay value will be different with CDMA nodes,
depending on the effective data rate. For each of the
systems, we assume that the delay is 1 unit for each
2000-bit message transmitted.

The energy*delay cost for data gathering in a network
of N nodes will be different for the schemes considered in
this paper and will depend on the node distribution in the
playing field. Consider the example network where the N
nodes are along a straight line with equal distance $d$
between each pair of nodes and the BS at a faraway
distance from all nodes. The direct approach will require
high energy cost and the delay will be N as nodes transmit
to the BS sequentially. The PEGASIS scheme [6], which
is near optimal in terms of energy cost for this data
gathering application in sensor networks, forms a chain
among the sensor nodes so that each node will receive
from and transmit to a close neighbor. For this linear
network with equally spaced nodes, the energy cost in
PEGASIS is minimized, the variable cost is
proportional to $N \cdot d^2$, and the delay will be N units.
Therefore, the energy*delay cost will be $N^2 \cdot d^2$.

In the binary scheme with perfect parallel transmission
of data, there will be N/2 nodes transmitting data to their
neighbors at distance $d$ in the lowest level. The nodes that
receive data will fuse the data with their own data and will
be active in the next level of the tree. Next, N/4 nodes will
transmit data to their neighbors at a distance $2d$, and this
procedure continues until a single node finally transmits
the combined message to the BS. Thus, for the binary
scheme the energy cost will be

$\frac{N}{2} d^2 + \frac{N}{4} (2d)^2 + \frac{N}{8} (4d)^2 + \ldots + 1 \cdot \left(\frac{N}{2} d\right)^2$

since the distance doubles as we go up the hierarchy. In
addition, there will be a single transmission to the BS and
the energy cost depends on the distance to the BS.
Without including this additional cost, simplifying the
above expression gives the energy cost for the binary
scheme as

$\frac{N}{2} d^2 \left(1 + 2 + 4 + \ldots + \frac{N}{2}\right)$,

which equals

$\frac{N(N-1)}{2} d^2$.

With the delay cost of about $\log N$ units, the
energy*delay for the binary scheme is approximately
$\frac{N^2}{2} d^2 \log N$.
Therefore, for this linear network, the binary scheme will
be more expensive than PEGASIS in terms of
energy*delay. For a random distribution of nodes in a
rectangular playing field, the distances do not double as
we go up the hierarchy in the binary scheme, and the
reduced delay will help reduce the energy*delay cost. It is
difficult to analyze this cost for randomly distributed
nodes and we will use simulations to evaluate this cost.
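
The comparison for the linear network can be checked numerically. The sketch below evaluates only the variable ($d^2$) part of the energy cost, ignores the final transmission to the BS, and assumes N is a power of two; the function names are illustrative.

```python
import math


def pegasis_linear(n, d):
    """Variable energy, delay and energy*delay for PEGASIS on a line."""
    energy = n * d ** 2      # proportional to N * d^2, as in the text
    delay = n
    return energy, delay, energy * delay


def binary_linear(n, d):
    """Same quantities for the binary scheme; n must be a power of two."""
    levels = int(math.log2(n))
    # At level j, N / 2^j nodes transmit over a distance 2^(j-1) * d.
    energy = sum((n // 2 ** j) * (2 ** (j - 1) * d) ** 2
                 for j in range(1, levels + 1))
    delay = levels
    return energy, delay, energy * delay


if __name__ == "__main__":
    for n in (8, 64, 128):
        print(n, pegasis_linear(n, 1.0), binary_linear(n, 1.0))
```

The summed energy matches the closed form $N(N-1)/2 \cdot d^2$, and the energy*delay of the binary scheme exceeds that of PEGASIS for this linear layout.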

For the rest of the analysis, we assume a 100-node 
sensor network in a square field with the BS located far 
away. In this scenario, energy costs can be reduced if the 
data is gathered locally among the sensor nodes and only 
a few nodes transmit the fused data to the BS. This is the 
approach taken in LEACH [3], where clusters are formed 
dynamically in each round and cluster-heads (leaders for 
each cluster) gather data locally and then transmit to the 
BS. Cluster-heads are chosen randomly, but all nodes 
have a chance to become a cluster-head in LEACH, to 
balance the energy spent per round by each sensor node. 
Nodes are able to transmit simultaneously to their cluster- 
heads using CDMA. For a 100-node network in a 50m x 
50m field with the BS located at (25,150), which is at 
least 100m from the closest node, LEACH reduces the 
energy*delay cost compared to the direct scheme. For the
linear network of N nodes that are equally spaced,
LEACH will have slightly higher energy compared to
PEGASIS due to the cluster-heads' transmissions to the BS,
and a delay of roughly N/c where c is the number of
clusters. With the 5 clusters suggested in [3], the
energy*delay for LEACH will be lower than for
PEGASIS for a 50m x 50m network. However, for a
100m x 100m network, the energy*delay for LEACH will
be higher than for PEGASIS.

4. A Chain-based Binary Approach using 
CDMA 

First, we consider a sensor network with nodes capable 
of CDMA communication. With this CDMA system, it is 
possible for node pairs that communicate to use distinct 
codes to minimize radio interference. Thus, parallel 
communication is possible with 50 pairs for the 100-node 
network of interest. In order to minimize the delay, we 
will combine data using as many pairs as possible in each 
level which results in a hierarchy of log N levels. At the 
lowest level, we will construct a linear chain among all 
the nodes, as was done in PEGASIS, so that adjacent 
nodes on the chain are nearby. For constructing the chain, 
we assume that all nodes have global knowledge of the 
network and employ the greedy algorithm. The greedy 
approach to constructing the chain works well, and this is 
done before the first round of communication. To 
construct the chain, we start with the furthest node from
the BS. We begin with this node in order to make sure
that nodes farther from the BS have close neighbors,
since in the greedy algorithm the neighbor distances will
increase gradually because nodes already on the chain
cannot be revisited.
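
The greedy chain construction described above can be sketched as follows. This is an illustration under the stated assumption of global knowledge of node positions; the function name and the coordinate representation are not from the paper.

```python
import math


def greedy_chain(nodes, bs):
    """Return node indices in chain order.

    nodes: list of (x, y) positions; bs: (x, y) position of the base station.
    Starts at the node furthest from the BS and repeatedly appends the
    nearest node that is not yet on the chain.
    """
    remaining = list(range(len(nodes)))
    current = max(remaining, key=lambda i: math.dist(nodes[i], bs))
    chain = [current]
    remaining.remove(current)
    while remaining:
        nearest = min(remaining,
                      key=lambda i: math.dist(nodes[i], nodes[current]))
        chain.append(nearest)
        remaining.remove(nearest)
        current = nearest
    return chain


if __name__ == "__main__":
    sensors = [(0, 0), (1, 0), (3, 0), (6, 0)]
    print(greedy_chain(sensors, bs=(0, 100)))
```

Because visited nodes are never revisited, the later links of the chain tend to be longer, which is why the construction starts at the node furthest from the BS.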

For gathering data in each round, each node transmits to
a close neighbor in a given level of the hierarchy. This
occurs at every level in the hierarchy, but only the nodes
that receive at a given level rise to the next level. Finally, at
the top level the only node remaining will be the leader,
and the leader will transmit the 2000-bit message to the
BS. Note that node i will be in some random position j on
the chain. Nodes take turns transmitting to the BS, and we
will use node number i mod N (N represents the number
of nodes) to transmit to the BS in round i. In Figure 2, for
round 3, node c3 is the leader. Since node c3 is in
position 3 (counting from 0) on the chain, all nodes in an
even position will send to their right neighbor. At the
next level, node c3 is still in an odd position, so again all
nodes in an even position will fuse their data with the
data they received and send to their right neighbor. At the
third level, node c3 is not in an odd position, so node c7
will fuse its data and transmit to c3. Finally, node c3 will
combine its current data with that received from c7 and
transmit the message to the BS. The chain-based binary
scheme performs data fusion at every node that is
transmitting except the end nodes in each level. Each node
will fuse its neighbor's data with its own to generate a
single packet of the same length and then transmit that to
the next node. In the above example, node c0 will pass its
data to node c1. Node c1 fuses node c0's data with its own
and then transmits to node c3 in the next level. In our
simulations, we ensure that each node performs an equal
number of sends and receives after N rounds of
communication, with each node transmitting to the BS in
one of the N rounds. We then calculate the average energy
cost per round, while the delay cost is the same for each
round.
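
The level-by-level pairing rule can be sketched in a few lines. The handling of a level with an odd number of active nodes (the unpaired end node simply rises to the next level) is an assumption made for this illustration, as are the function and variable names.

```python
def binary_schedule(chain, leader):
    """Return, per level, the list of (sender, receiver) pairs.

    chain: node ids in chain order; leader: id of the node that will
    transmit to the BS in this round.
    """
    active = list(chain)
    levels = []
    while len(active) > 1:
        pos = active.index(leader)
        sends = []
        if pos % 2 == 1:
            # Leader in an odd position: even positions send to the right.
            survivors = []
            for i in range(0, len(active) - 1, 2):
                sends.append((active[i], active[i + 1]))
                survivors.append(active[i + 1])
            if len(active) % 2 == 1:
                survivors.append(active[-1])  # unpaired end node (assumption)
        else:
            # Leader in an even position: odd positions send to the left.
            for i in range(1, len(active), 2):
                sends.append((active[i], active[i - 1]))
            survivors = active[0::2]
        levels.append(sends)
        active = survivors
    return levels


if __name__ == "__main__":
    chain = ["c%d" % i for i in range(8)]
    for level in binary_schedule(chain, "c3"):
        print(level)
```

For the 8-node chain with leader c3 this reproduces the three levels of the example; the final transmission from the leader to the BS adds one more unit of delay.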



BS
↑
c3
c3 <- c7
c1 -> c3     c5 -> c7
c0 -> c1     c2 -> c3     c4 -> c5     c6 -> c7



Figure 2. Data gathering in a chain-based binary scheme. 

The chain-based binary scheme improves on LEACH 
by saving energy and delay in several stages. At the lower 
levels, nodes are transmitting at shorter distances 
compared to nodes transmitting to a cluster head in the 
LEACH protocol, and only one node transmits to the BS 
in each round of communication. By allowing nodes to 
transmit simultaneously, the delay cost for the binary 
scheme decreases from that of LEACH by a factor of 
about 3. While in LEACH, only 5 groups can transmit 
simultaneously, here at each level, we have multiple 
nodes transmit simultaneously. At each level of the binary 
scheme, transmissions are simultaneous making the total 
delay log N +1 for transmitting to the BS. In LEACH, the 
delay for 100 node networks will be 27 units. The delay 
for all nodes to transmit to the cluster-head is the max 
number of nodes in any of the 5 clusters. If all the clusters 
are of the same size, then the delay would be 19. Then all 
5 cluster-heads must take turns to transmit to the BS, 
making that a total of 24. For overhead calculations, we
have 1 unit of delay for cluster formation, 1 unit of delay
for all nodes to announce to the cluster-head their presence
in that cluster, and finally 1 unit of delay for the cluster-
head to broadcast a TDMA schedule to the nodes so that
nodes will know when to transmit to the cluster-head.
Together these give the total of 27 units.
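
The delay figures quoted above follow from simple arithmetic, sketched here. Equal-size clusters and the three overhead units are taken from the text; rounding log N up to the next integer and the function names are assumptions of this illustration.

```python
import math


def leach_delay(n_nodes=100, n_clusters=5, overhead_units=3):
    """Delay units for LEACH with equal-size clusters."""
    members = n_nodes // n_clusters - 1   # nodes sending to each cluster-head
    return members + n_clusters + overhead_units


def binary_delay(n_nodes=100):
    """Delay units for the binary scheme: levels plus the send to the BS."""
    return math.ceil(math.log2(n_nodes)) + 1


if __name__ == "__main__":
    print(leach_delay(), binary_delay(), leach_delay() / binary_delay())
```

For 100 nodes this gives 27 units for LEACH and 8 units for the binary scheme, a ratio of about 3.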

5. A Chain-based 3 Level Scheme without
CDMA

CDMA may not be applicable for all sensor networks,
as CDMA-capable nodes can be expensive. Therefore, we need a
protocol that will achieve a minimal energy*delay with
non-CDMA nodes. It will not be possible to use the
binary scheme in this case, as the interference will be too
high at the lower levels. We would either have to increase the
energy cost significantly or take more time steps at the lower
levels of the hierarchy, both of which lead to a much
higher energy*delay cost. Therefore, in order to improve
energy*delay we need a protocol that allows simultaneous
transmissions that are far apart, to minimize interference
while achieving a reasonable delay cost.

Based on our experiments, we suggest the chain-based
3 level scheme for data gathering in sensor networks with
non-CDMA nodes. In the 3 level scheme also, we start
with the linear chain among all the nodes and divide them
into 10 groups. In the 100-node network, therefore, only
10 simultaneous transmissions take place at the same
time, and data fusion takes place at each node (except the
end nodes in each level). The transmissions are also far
enough apart that there is minimal interference. Figure 3
shows an example of this scheme with 100 nodes. Here
we would have 10 groups of 10. We will have a different
leader each round transmit to the BS to evenly distribute
the work load among the sensor nodes. We find the index
i which will represent the leader position modulo 10. In
Figure 3, c18 is our leader. Then all nodes will send their
data in the direction of index 8 within their group, since 18
modulo 10 is 8. The delay at the first level is 9 units.
Then the second level will contain nodes c8, c18,
c28, ..., c98. These 10 nodes will be divided into two
groups. Since the leader position is 18, all nodes that are
in the first group and above the leader will send down the
chain, 10 positions from their own position on the chain.
So node c48 will send to node c38, node c38 will send to
node c28, and so on. Since node c8's position is less than
node c18's, node c8 will transmit to a position that is 10
greater than its own. In group 2, nodes know in which
direction to send the data using the leader position + 50.
So here, the nodes in group 2 would send in the direction
of node c68 in the same manner as in group 1. This gives
us a delay of 4 units for the second level. In the third
level, node c68 transmits to node c18, and then finally
node c18 transmits to the base station, giving us a total
delay of 15 units. The transmission schedule can be
programmed once at the beginning so that all nodes know
where to send data in each round of communication.
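
The bookkeeping for the 3 level scheme can be sketched as below. The sketch assumes that transmissions inside a group are sequential while different groups proceed in parallel, which is how the delays of 9, 4 and 1 units arise in the example; the function name and return layout are illustrative.

```python
def three_level_plan(leader, n=100, group=10):
    """Return (level-2 nodes, level-2 groups, group-2 target, total delay)."""
    offset = leader % group                  # index inside each level-1 group
    heads = [g * group + offset for g in range(n // group)]
    half = len(heads) // 2
    level2_groups = [heads[:half], heads[half:]]
    other_target = (leader + n // 2) % n     # target of the other level-2 group
    delay = (group - 1) + (half - 1) + 1 + 1  # level 1, level 2, level 3, BS
    return heads, level2_groups, other_target, delay


if __name__ == "__main__":
    heads, groups, target, delay = three_level_plan(18)
    print(heads)
    print(groups)
    print(target, delay)
```

With leader c18 this yields the level-2 nodes c8, c18, ..., c98, the second-group target c68, and a total delay of 15 units.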



BS
↑
c18
c18 <- c68
c8 -> c18 <- c28 <- c38 <- c48     c58 -> c68 <- c78 <- c88 <- c98
c0 -> c1 ... c7 -> c8 <- c9   c10 -> c11 ... c18 <- c19   ...   c90 -> c91 ... c98 <- c99

Figure 3. Chain-based 3 level scheme for a sensor
network with non-CDMA nodes.

6. Experimental Results 

To evaluate the performance of the chain-based binary 
scheme and the chain-based 3 level scheme, we simulated 
direct transmission, PEGASIS, LEACH, and the two new 
schemes using several random 100-node networks with 
CDMA nodes and non-CDMA nodes. The BS is located 
at (25, 150) in a 50m x 50m field, and the BS is located at 
(50,300) in a 100m x 100m field. We ran the simulations 
to determine the energy cost for all the schemes after all 
100 nodes had a chance to become leader, with each node 
having the same initial energy level. We then used these 
costs to determine the average energy cost per round of 
data gathering. In both CDMA and non-CDMA systems, 
we included the interference costs when there are 
simultaneous transmissions to ensure that the same SNR 
of 10 dB is maintained as with single transmission. Our 
simulations show: 

• The chain-based binary scheme is approximately
8x better than LEACH and 130x better than direct
for a 50m x 50m network in terms of energy*delay
for sensor networks with CDMA nodes.

• The chain-based binary scheme is approximately
12x better than LEACH and 280x better than direct
for a 100m x 100m network in terms of
energy*delay for sensor networks with CDMA nodes.

• The chain-based 3 level scheme is approximately
4x better than PEGASIS and 60x better than direct
for a 50m x 50m network in terms of energy*delay
for sensor networks with non-CDMA nodes.

• The chain-based 3 level scheme is approximately
4x better than PEGASIS and 140x better than
direct for a 100m x 100m network in terms of
energy*delay for sensor networks with
non-CDMA nodes.

• Both schemes give a more balanced energy
dissipation among the sensor nodes, making full
use of the complete sensor network.

These results are summarized in Table 1 and Table 2. 
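
The improvement factors quoted in the list above can be recomputed from the energy*delay values reported in Table 1 and Table 2. The dictionary layout and function name in this sketch are illustrative; the numbers are copied from the tables.

```python
# Energy*delay values copied from Table 1 (50m x 50m) and Table 2 (100m x 100m).
ENERGY_DELAY = {
    "50m": {"Direct": 32.9938, "PEGASIS": 2.4008, "LEACH": 2.1518,
            "Binary": 0.2547, "3 level": 0.5365},
    "100m": {"Direct": 128.0459, "PEGASIS": 3.6107, "LEACH": 5.5292,
             "Binary": 0.4516, "3 level": 0.8743},
}


def improvement(field, baseline, scheme):
    """Ratio of the baseline energy*delay to that of the given scheme."""
    table = ENERGY_DELAY[field]
    return table[baseline] / table[scheme]


if __name__ == "__main__":
    for field in ("50m", "100m"):
        print(field,
              round(improvement(field, "LEACH", "Binary"), 1),
              round(improvement(field, "Direct", "Binary"), 1),
              round(improvement(field, "PEGASIS", "3 level"), 1),
              round(improvement(field, "Direct", "3 level"), 1))
```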

Table 1. Energy*delay cost for Direct, PEGASIS,
LEACH, chain-based binary scheme and the chain-based
3 level scheme. These results are for a 50m x 50m
network.

Protocol                                Energy      Delay   Energy*Delay
Direct (both systems)                   0.32993     100     32.9938
PEGASIS (both systems)                  0.024008    100     2.4008
LEACH (CDMA nodes)                      0.079696    27      2.1518
Chain-based binary (CDMA nodes)         0.031847    8       0.2547
Chain-based 3 level (non-CDMA nodes)    0.035772    15      0.5365


Table 2. Energy*delay cost for Direct, PEGASIS,
LEACH, chain-based binary scheme and the chain-based
3 level scheme. These results are for a 100m x 100m
network.

Protocol                      Energy      Delay   Energy*Delay
Direct (both systems)         1.280459    100     128.0459
PEGASIS (both systems)        0.036107    100     3.6107
LEACH (CDMA nodes)            0.204786    27      5.5292
Binary (CDMA nodes)           0.055898    8       0.4516
3 Level (non-CDMA nodes)      0.058287    15      0.8743

7. Conclusions and Future Work 

In this paper, we describe two new protocols for
energy*delay reduction for data gathering in sensor
networks: a chain-based binary scheme for sensor
networks with CDMA nodes and a chain-based 3 level
scheme for sensor networks with non-CDMA nodes. The
binary scheme performs better than direct, PEGASIS, and
LEACH. It is better than LEACH by a factor of
about 8, better than PEGASIS by a factor of about 10, and
more than 100 times better than the direct
scheme. In these experiments, the interfering
transmissions contribute 1/128 the value of their
transmission energy. The chain-based 3 level scheme with
non-CDMA nodes outperforms PEGASIS by a factor of 4
and is better than direct by a factor of 60. This chain-
based scheme outperforms PEGASIS by dividing the
chain into "groups" and allowing simultaneous
transmissions among pairs in different groups. While
energy is still minimal, the delay is decreased from 100
units to 15 units.

It is not clear as to what is the optimal scheme for 
minimizing energy*delay in a sensor network. Since the 
energy costs of transmissions depend on the spatial 
distribution of nodes, there may not be a single scheme 
that is optimal for all sizes of the network. Our
preliminary experimental results indicate that for small
networks the binary scheme performs best, as minimizing
delay achieves the best result for energy*delay. With larger
networks, we expect nodes in the higher levels of the
hierarchy to be far apart, and it is possible that a different
multi-level scheme may outperform the binary scheme.
When using non-CDMA nodes, interference effects can 
be reduced by carefully scheduling simultaneous 
transmissions. Since there is an exponential number of 
possible schedules, it is intractable to determine the 
optimal scheduling to minimize energy*delay cost. A 
practical scheme to employ will depend on the size of the 
playing field and the distribution of nodes in the field. 

In this paper, we restricted our discussions to the $d^2$
model for energy dissipation for wireless
communications. In our future work, we will consider
higher order energy dissipation models and develop
schemes to minimize energy*delay cost for data gathering
applications.

Abstract

To reduce interference in a communication system (10), a
communication unit (42-50) is arranged to initiate establishment
of a radio frequency communication with a base station (26-38) by
transmitting a system access request on a dedicated, wide area
control channel. Upon receipt of the system access request, a base
station (32) of the communication system of FIG. 1 responds by
forming a narrowbeam control channel to the communication unit and
transmitting system control information to the communication unit
on the narrowbeam control channel, the system control information
transmitted from the array of antenna elements and arranged to
identify a narrowbeam communication resource for use in the radio
communication. The communication unit (42-50), upon receiving the
system control information, then configures itself to utilise the
narrowbeam communication resource for the radio communication.

Claims

What is claimed is: 

1. A method of establishing radio
communication between a communication unit 
and a base station having an array of antenna 
elements, the method comprising the steps of: 

receiving a system access request at each 
base station of a plurality of base stations, 
wherein each base station of the plurality of 
base stations comprises an antenna array; 
making signal parameter measurements of 
the access request received by each base 
station of the plurality of base stations; 
determining a rank order of signal parameter 
measurements with respect to the plurality of 
base stations; 

selecting a base station to serve the 
communication unit based on the rank order 
to produce a serving base station; 
at the serving base station, in response to the 
received system access request, forming a 
first narrowbeam control channel to the 
communication unit and transmitting system 
control information to the communication unit 
on the first narrowbeam control channel, the 
system control information transmitted from 
the array of antenna elements and arranged 
to identify a narrowbeam communication 
resource for use in the radio communication; 
at the serving base station, receiving a 
request for assignment of a new narrowbeam 
control channel; 

in response to the request for assignment of a 
new narrowbeam control channel, instructing 
at least one non-serving base station of the 
plurality of base stations to prepare to 
transmit narrowbeam control channels at the 
communication unit; 

at the at least one non-serving base station of 
the plurality of base stations, in response to 
receiving an instruction to prepare to transmit, 
notifying the serving base station of channel 
assignment information pertaining to a 
subsequent transmission of the narrowbeam 
control channels at the communication unit; 
and 

at the serving base station, in response to 
receiving the channel assignment information 
from the at least one non-serving base 
station, notifying the communication unit of 
the channel assignment information on the 
first narrowbeam control channel. 

2. The method of claim 1, further comprising the
step of, at the serving base station, periodically 
altering a beam pattern of the first narrowbeam 
control channel. 

3. The method of claim 2, wherein the beam 
pattern of the first narrowbeam control channel 
is oscillated about an expected position of the 
communication unit. 

4. The method of claim 2, wherein a width of the 
beam pattern of the first narrowbeam control 
channel is pulsed. 

5. The method of claim 2, further comprising the 
step of, at the serving base station, transmitting 
beam pattern information on the first 
narrowbeam control channel identifying how the 
beam pattern of the first narrowbeam control 
channel is altering. 

6. The method of claim 1, further comprising the
steps of: 

determining a base station handoff candidate 
from among the at least one non-serving 
base stations; and 

receiving, by the base station handoff 
candidate and via a narrowbeam control 
channel associated with the base station 
handoff candidate, a signal to initiate handoff 
of the radio communication. 

7. The method of claim 1, further comprising the
steps of: 

storing base station location information and 
communication unit location information; and 
by each base station of the plurality of base 
stations, accessing the base station location 
information and the communication unit 
location information to beamform, during 
handoff of the radio communication, a 
narrowbeam control channel in a direction of 
the communication unit. 

8. A radio communication system for supporting 
radio communication between a communication 
unit and at least one of a plurality of base 
stations, the system comprising: 

a plurality of base stations, wherein each 
base station of the plurality of base stations 
comprises: 
an array of antenna elements; 
a means, responsive to the array of 
antenna elements, for receiving and 
processing a system access request; 
a means, responsive to the system access
request, for assigning and generating 
system control information identifying a 
narrowbeam communication resource for 
use in a radio communication with the 
communication unit; 

a means, coupled to the array of antenna 
elements, for forming and transmitting a 
first narrowbeam control channel to the 
communication unit in response to the 
received system access request; 
a means for making signal parameter 
measurements of the access request 
received at each base station of the plurality 
of base stations; 

a means for determining a rank order of
signal parameter measurements with respect
to the plurality of base stations;
a means for selecting a serving base station
to serve the communication unit from the rank
order;

a means at the serving base station for 
receiving a request for assignment of a new 
narrowbeam control channel; 
a means for instructing, in response to the 
request for assignment of a new narrowbeam 
control channel, at least one non-serving 
base station of the plurality of base stations to 
prepare to transmit narrowbeam control 
channels at the communication unit; 
a means at the at least one non-serving base 
station of the plurality of base stations for 
notifying, in response to receiving an 
instruction to prepare to transmit, the serving 
base station of channel assignment 
information pertaining to a subsequent 
transmission of the narrowbeam control 
channels at the communication unit; and 
a means at the serving base station for 
notifying, in response to receiving the channel 
assignment information from the at least one 
non-serving base station, the communication 
unit of the channel assignment information on 
the first narrowbeam control channel. 

9. The communication system of claim 8, 
wherein the serving base station further 
comprises means for periodically altering a 
beam pattern of the first narrowbeam control 
channel. 

Abstract. Future mobile communication systems can support not only
voice but also multimedia applications such as data, image and video,
which require greater resources than voice-oriented mobile systems. If
handoff events occur during the transmission of multimedia traffic,
efficient resource allocation and handoff procedures are necessary to
maintain the QoS of the transmitted traffic, because the QoS may be
degraded by delay and information loss. This paper proposes a
resource reservation and allocation scheme to accommodate multimedia
traffic based on direction estimation in mobile multimedia networks.
The scheme estimates the position of mobiles using a two-step
estimation comprising sector estimation and zone estimation. With the
position information, the moving direction is determined.

1 Introduction 

The explosive growth of Internet access in parallel with the technological
advances in mobile communications has motivated mobile computing and
multimedia applications in wireless mobile networks. A key characteristic of
multimedia services is that they require different Quality of Service (QoS)
guarantees. Due to the limitations of the radio spectrum, wireless systems use
micro-cellular architectures in order to provide a higher capacity. Because of
the small coverage area of micro-cells, the availability of network resources
varies frequently as users move from one access point to another [1]. In order
to deterministically guarantee QoS support for a mobile unit, the network must
have exact prior knowledge of the mobile's mobility. The majority of existing
schemes to support mobility reserve resources in adjacent cells. However, these
techniques waste resources because they disregard the direction of the
mobiles. Also, existing methods for predicting and reserving resources for future
handoff calls do not seem to be suitable for mobile multimedia networks. The
amount of resources required to successfully perform handoff may vary
arbitrarily over a wide range in a mobile multimedia network. For example, data
and video applications may adapt to different service quality levels and consequently

P.M.A. Sloot et al. (Eds.): ICCS 2003, LNCS 2660, pp. 555-565, 2003.
© Springer-Verlag Berlin Heidelberg 2003
may accept different levels of resources in order to ensure a successful handoff
[2] [3] [4]. In this paper, we consider a mobile network supporting the diverse
traffic characteristics of voice, data, and video applications. Since the connections
can now differ in the amount of resources required to meet their QoS needs, the
question is how a base station should dynamically adapt the amount of resources
reserved for dealing with handoff requests. In this paper, the handoff requests
for real-time connections are handled based on direction prediction and the
resource reservation scheme. The resources in the estimated adjacent cells should
be reserved to guarantee the continuity of the real-time connections. If handoff
requests occur during the transmission of multimedia traffic, efficient
resource allocation and handoff procedures are necessary to maintain the
QoS of the transmitted traffic, because the QoS may be degraded by
delay and information loss. This paper proposes a handoff scheme to transmit
multimedia traffic based on resource reservation using direction estimation.



2 Direction Prediction Method 

Figure 1 shows how our scheme divides a cell into many zones based on the
signal strength, and then estimates stepwise the zone in which the mobile
is located. This process is based on a two-step location estimation which
determines the mobile position by gradually reducing the area in which the
mobile may be [5] [6]. The scheme is implemented as an estimator in the base
station. The estimator is started by a timer, and the estimation is then
performed sequentially in two steps. The estimator first estimates the location
sector in the sector estimation step, then estimates the location zone in the
zone estimation step, and finally estimates the mobile's direction.

2.1 Sector Estimation 

The sector estimation, the first step of the location estimation, follows
this procedure.

A. All the neighboring base stations transmit pilot signals periodically.

B. The demodulator of the mobile measures the pilot signal strengths (PSSs)
of the neighboring base stations.

C. The mobile sends a PSMM (Pilot Strength Measurement Message) to the base
station.

D. The estimator in the base station compares the received strengths of the
pilot channels with each other and chooses the sector adjacent to the
neighboring base station with the greatest signal strength as the sector in
which the mobile is located.

E. The sector number is registered to the object information.
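
Step D of the procedure above amounts to picking the largest reported pilot strength. The sketch below assumes the PSMM has already been decoded into a mapping from sector number to measured strength; that representation and the function name are hypothetical, not part of the paper.

```python
def estimate_sector(pilot_strengths):
    """Return the sector whose neighboring base station has the strongest pilot.

    pilot_strengths: dict mapping sector number -> PSS reported in the PSMM
    (hypothetical representation; larger values mean a stronger signal).
    """
    if not pilot_strengths:
        raise ValueError("PSMM contains no pilot measurements")
    return max(pilot_strengths, key=pilot_strengths.get)


if __name__ == "__main__":
    # Illustrative values only.
    print(estimate_sector({1: -85.0, 2: -70.5, 3: -92.0}))
```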

2.2 Zone Estimation

Each cell is divided into n zones, with each zone classified by PSS. The following
LOS algorithm summarizes the zone estimation procedure for the LOS model.

1. Select each threshold considering PSS.

2. In order to map the signal strength onto the direction information, determine
the distance function for each threshold with Equation (1).

$P_A(d) = k_1 - k_2(d) + u(t)$
$P_B(d) = k_1 - k_2(D - d) + v(t)$    (1)

In Equation (1), $D$ indicates the distance between two base stations
and $d$ the distance between base station A and the mobile. $k_1$ is
proportional to the transmission power of the station and $k_2$ is an offset
value depending on the radio propagation environment. The two random signals
$u(t)$ and $v(t)$, which indicate the power distributions of the signals
received at distance $d$ from station A and from station B respectively, are
i.i.d. (independent and identically distributed) with Gaussian distribution
$N(\mu(d), \sigma)$. The average value of the received signal at a specific
location is determined by the path-loss component proportional to the distance,
and $\sigma$ is assumed to be the same. The changes in the LOS are depicted
by $k_2$.

3. Classify zones using the distance function.

4. Assign the zone number and the PSS threshold to all divided zones.

Fig. 1. Sector, zone and a mobile's moving direction

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Using the sector of blocks selected in the sector estimation, the zone estimation, 
the second step of the location estimation, estimates the zone of blocks at one 
of which the mobile locates. It is done in the following procedure. 

A. The base station transmits the pilot signal periodically. 

B. The demodulator of the mobile measures the signal strength of the pilot 
channel of the base station whose cell it is in. 

C. The mobile sends PSMM to the base station. 

D. The estimator estimates the zone using the LOS algorithm. 

E. The estimated zone number is registered to a zone object. 
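
Step D can be sketched as a threshold lookup. The sketch below assumes that zone 1 is the zone nearest the base station (strongest PSS), that the PSS thresholds from step 1 of the LOS algorithm are already known, and that they are listed in descending order in dB. The function name `estimate_zone` and the threshold values are hypothetical; in the paper the thresholds would follow from the distance function of Equation (1).

```python
def estimate_zone(pss_db, thresholds_db):
    """Classify a measured PSS into one of n zones.

    thresholds_db holds n-1 thresholds in descending order (assumption).
    A PSS at or above thresholds_db[0] falls in zone 1, the innermost zone;
    a PSS below every threshold falls in zone n, the outermost zone.
    """
    if list(thresholds_db) != sorted(thresholds_db, reverse=True):
        raise ValueError("thresholds must be in descending order")
    for zone, threshold in enumerate(thresholds_db, start=1):
        if pss_db >= threshold:
            return zone
    return len(thresholds_db) + 1


# Hypothetical thresholds for a cell divided into three zones.
THRESHOLDS_DB = [-70.0, -85.0]
zone = estimate_zone(-80.0, THRESHOLDS_DB)
```

The zone number returned here is what step E would record in the zone object.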

2.3 Direction Estimation 

The estimator estimates a sector in the sector estimation step and a zone in the 
zone estimation step, and finally computes the mobile's direction from the 
vector between the estimated location and the previous one. To indicate the 
location of each zone within a cell, we use vector data obtained by converting 
the rectangular coordinates of the zone to polar coordinates with the base 
station as the origin [5]. Each vector carries a distance and an angle: the polar 
coordinates give the distance from the origin to the mobile and the angle 
measured counter-clockwise from the positive horizontal axis. We use polar 
coordinates converted from the rectangular coordinates because our study needs 
direction information to identify the sector relative to the base station and 
the relative position of a zone. 
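
The coordinate conversion and the vector between two successive location estimates can be sketched as follows. This is an illustration under stated assumptions: the base station sits at the origin, angles are reported in degrees in the range [0, 360), and the function names `to_polar` and `movement_vector` are hypothetical.

```python
import math


def to_polar(x, y):
    """Convert rectangular coordinates to (distance, angle in degrees).

    The angle is measured counter-clockwise from the positive horizontal
    axis and normalized to [0, 360).
    """
    distance = math.hypot(x, y)
    angle_deg = math.degrees(math.atan2(y, x)) % 360.0
    return distance, angle_deg


def movement_vector(previous_xy, current_xy):
    """Polar form of the displacement from the previous to the current estimate."""
    dx = current_xy[0] - previous_xy[0]
    dy = current_xy[1] - previous_xy[1]
    return to_polar(dx, dy)


# A zone centre 2 units "above" the base station, then a move to the left.
zone_distance, zone_angle = to_polar(0.0, 2.0)
step_length, heading = movement_vector((0.0, 2.0), (-1.0, 2.0))
```

Comparing the distance component of two successive estimates indicates whether the mobile moved toward or away from the base station, while the angle indicates the sector.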

A mobile's direction is classified, based on its movement, as toward the upper 
zone (d11, d12 and d13), toward the lower zone (d16, d17 and d18) or within the 
same zone (d14 and d15), as shown in Figure 1. The moving radius of a mobile 
heading from the lower zone toward the upper zone becomes wider, while the 
moving radius of a mobile heading toward the lower zone becomes narrower. That 
is, predicting the direction of a mobile moving toward its BS is more difficult 
than for a mobile moving toward its cell boundary. Therefore, for a mobile 
moving toward the upper zone it is efficient to increase the number of cells 
within its moving radius that are expected to receive the handoff, and to 
decrease that number for a mobile moving toward the lower zone. 
Figure 2 shows the direction estimation algorithm based on the above conditions. 

Fig. 2. Direction estimation algorithm 

2.3.1 Resource Reservation for Low-Speed Mobiles 

The moving radius and the moving pattern of a mobile have different 
characteristics according to the speed of the mobile. That is, a low-speed 
mobile (a pedestrian) has a smaller moving radius and a more complex moving 
pattern, while a high-speed mobile (a motor vehicle) has a larger radius and a 
simpler pattern. Using those characteristics, reservation variables such as the 
current position and the moving direction of a mobile are defined, and the 
neighboring cells that need to reserve resources are decided. In the case of 
low-speed mobiles, the result of the direction estimation can vary significantly 
due to two factors. First, the current position is considered: because the 
moving radius of low-speed mobiles is narrow, whether a handoff will occur can 
be estimated from the current position. Second, the moving direction is 
considered: handoff attempts decrease for mobiles moving toward the inside of 
the cell and increase for mobiles moving toward the outside of the cell. The 
location zone is estimated in the sector and zone estimation steps; then, based 
on the location of the estimated zone and the predicted direction, the 
neighboring cells are ranked according to the likelihood that the mobile will 
move into them. The cells that need to reserve resources are decided by the 
sector estimation, and the resource reservation for those cells is done using 
the current position estimated by the zone estimation. Mobiles moving toward the 
upper zone need not reserve resources. Resources for mobiles moving toward the 
lower zone are reserved only if their estimated position is zone-3. The resource 
reservation conditions based on the reservation variables for a low-speed mobile 
are as follows. 

- Condition 1: if the current position is zone-1, no reservation is made 
regardless of the moving direction. 

- Condition 2: if the current position is zone-2 and the mobile has moved from 
zone-1, no reservation is made. 

- Condition 3: if the current position is zone-2 and the mobile has moved from 
zone-3, the reservation is made for at most two cells. 

- Condition 4: if the current position is zone-3 and the mobile has moved from 
zone-2, the reservation is made for at most one cell. 
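
The four conditions above can be written as a small decision function. This is a minimal sketch: the function name `low_speed_reservation` and the convention of returning `None` for transitions that the listed conditions do not cover are assumptions made here for illustration.

```python
def low_speed_reservation(previous_zone, current_zone):
    """Maximum number of neighboring cells to reserve for a low-speed mobile.

    Returns 0 when no reservation is made, and None when the transition
    is not covered by Conditions 1-4 (an assumption of this sketch).
    """
    if current_zone == 1:
        return 0      # Condition 1: zone-1, regardless of direction
    if current_zone == 2 and previous_zone == 1:
        return 0      # Condition 2: moved from zone-1 to zone-2
    if current_zone == 2 and previous_zone == 3:
        return 2      # Condition 3: moved from zone-3 to zone-2
    if current_zone == 3 and previous_zone == 2:
        return 1      # Condition 4: moved from zone-2 to zone-3
    return None


cells_to_reserve = low_speed_reservation(previous_zone=2, current_zone=3)
```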



2.3.2 Resource Reservation for High-Speed Mobiles 

The reservation variable for high-speed mobiles is the moving direction. If the 
mobiles move fast, a handoff is expected to be more likely; therefore the 
reservation is needed regardless of their current position within the cell. 
Since the mobility of high-speed mobiles has a varying randomness, a cluster 
of cells that reflects the user mobility is needed to reserve the resources. 
The resource reservation conditions based on the reservation variable for a 
high-speed mobile are as follows. 

- Condition 1: If the mobile moves from zone-1 to zone-2 within the same 
sector, the resources are reserved for three cells that reflect the moving 
direction. 

- Condition 2: If the mobile moves from zone-2 to zone-3 within the same 
sector, the resources are reserved for one cell that reflects the moving 
direction. 

- Condition 3: If the mobile moves from zone-3 to zone-2 within the same 
sector, the resources are reserved for five cells that reflect the moving 
direction. 

- Condition 4: If the mobile moves from zone-1 to zone-2 within the same 
sector, the resources are reserved for three cells that reflect the moving 
direction. 

- Condition 5: If the mobile moves from zone-1 to zone-2 within the other 
sector, the resources are reserved for three cells that reflect the moving 
direction. 

- Condition 6: If the mobile moves from zone-2 to zone-3 within the other 
sector, the resources are reserved for one cell that reflects the moving 
direction. 

- Condition 7: If the mobile moves from zone-3 to zone-2 within the other 
sector, the resources are reserved for five cells that reflect the moving 
direction. 

- Condition 8: If the mobile moves from zone-2 to zone-1 within the other 
sector, the resources are reserved for five cells that reflect the moving 
direction. 
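
Because the conditions for high-speed mobiles depend only on the zone transition and on whether the sector changes, they can be sketched as a lookup table. The table below transcribes the conditions exactly as listed, so Conditions 1 and 4, which are worded identically in the source, share one entry. The names `HIGH_SPEED_CELLS` and `high_speed_reservation`, and the choice of returning `None` for unlisted transitions, are assumptions of this sketch.

```python
# Key: (previous_zone, current_zone, same_sector) -> number of cells to reserve.
HIGH_SPEED_CELLS = {
    (1, 2, True): 3,    # Conditions 1 and 4 (identical in the source text)
    (2, 3, True): 1,    # Condition 2
    (3, 2, True): 5,    # Condition 3
    (1, 2, False): 3,   # Condition 5
    (2, 3, False): 1,   # Condition 6
    (3, 2, False): 5,   # Condition 7
    (2, 1, False): 5,   # Condition 8
}


def high_speed_reservation(previous_zone, current_zone, same_sector):
    """Number of cells to reserve for a high-speed mobile, or None if the
    transition is not among the listed conditions (assumption)."""
    return HIGH_SPEED_CELLS.get((previous_zone, current_zone, same_sector))


cells_to_reserve = high_speed_reservation(3, 2, same_sector=False)
```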
3 Direction Prediction Based Resource Reservation and Allocation 

3.1 Resource Reservation and Allocation Structure 

A real-time mobile performs resource reservation for handoff, and the set of 
reserved resources can be occupied temporarily by non-real-time mobiles within 
the target cell. On the other hand, a non-real-time mobile does not perform 
resource reservation for handoff; its resource allocation request is buffered in 
the waiting queue of the target BS for the duration of the handoff and is given 
a priority based on the service demand time. If the reserved set is returned 
because the corresponding real-time mobile is handed off, the priority of the 
request drops to the lowest rank. This strategy is illustrated in Figure 3. 



[Flowchart; recoverable labels: allocate the reserved resource, allocate the 
resource, non-real-time queue] 

Fig. 3. Admission control for multimedia connections 



3.2 Resource Reservation Procedure 

The base station reserves only the resources corresponding to the minimum 
transmission rate to the mobile. Based on the location and the direction of 
the mobile within a cell, the resource reservation is performed in the following 
order: unnecessary state, not necessary state, necessary state, and positively 
necessary state. If the reservation variable for the mobile changes, the 
reservation is canceled and the resources have to be released in the reverse 
order and returned to the pool of available resources. The set of reserved 
resources has a priority that depends on which connections it can be allocated 
to: a real-time handoff connection (priority 1), a non-real-time handoff 
connection (priority 2) and a non-real-time new connection (priority 3). 
- Unnecessary State 

The resource reservation need not be performed. 

This state corresponds to resource reservation conditions 1 and 2 for 
low-speed mobiles. 

- Not Necessary State 

A set of the reserved resources corresponds to priority 3. 
If any resources are available in each of the estimated cells, the resources 
are reserved for each of the mobiles. A set of the reserved resources can 
be occupied by new connections if not enough resources are available 
for a new connection in each of the estimated cells. 

If resources are available to support the reservation in the 
estimated cell, and a moving connection competes with a new connection 
for the resources, the resources are occupied in the following order: a 
real-time handoff connection, a real-time new connection, a non-real-time 
handoff connection, and a non-real-time new connection. 
If no resources are available, the reservation is not made. 
This state corresponds to resource reservation conditions 3, 7 and 8 for 
high-speed mobiles. 

- Necessary State 

A set of the reserved resources corresponds to priority 1. 
If not enough resources are available to accommodate a new connection, 
a set of the reserved resources for real-time handoff connections can be 
occupied by non-real-time new connections. 

If resources are available to support the reservation in the estimated 
cell, and a moving connection competes with a new connection for the 
resources, the order of occupying the resources is the same as in the Not 
Necessary State. 

If no resources are available for the reservation in the estimated cell, the 
shared part of the resources is allocated and reserved for a real-time connection. 
This state corresponds to resource reservation condition 4 for low-speed 
mobiles. 

This state corresponds to resource reservation conditions 1, 4, 5 and 8 
for high-speed mobiles. 

- Positively Necessary State 

The reserved resources correspond to priority 1. 

New connections cannot occupy the reserved resources. 

If a moving connection competes with a new connection for resources 
in the estimated cell, the resources are occupied in the following order: a 
real-time handoff connection, a non-real-time handoff connection, a real-time 
new connection and a non-real-time new connection. 

If no resources are available for the reservation in the estimated cell, the 
shared part of the resources can be allocated and reserved for both real-time 
connections and non-real-time connections. 



Resource Reservation and Allocation Based on Direction Prediction 



This state corresponds to resource reservation condition 4 for low-speed 
mobiles. 

This state corresponds to resource reservation conditions 2 and 6 for 
high-speed mobiles. 
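
The occupancy orders stated for the reservation states can be sketched as an ordering table applied to competing requests. This is an illustration only: the state keys, the connection-type labels and the function `order_competing_requests` are hypothetical names, and the Unnecessary State is omitted because no reservation is performed in it.

```python
RT_HANDOFF = "real-time handoff"
RT_NEW = "real-time new"
NRT_HANDOFF = "non-real-time handoff"
NRT_NEW = "non-real-time new"

# Order in which competing connections occupy resources, per state.
OCCUPANCY_ORDER = {
    "not_necessary": (RT_HANDOFF, RT_NEW, NRT_HANDOFF, NRT_NEW),
    # The text states the Necessary State uses the same order as Not Necessary.
    "necessary": (RT_HANDOFF, RT_NEW, NRT_HANDOFF, NRT_NEW),
    "positively_necessary": (RT_HANDOFF, NRT_HANDOFF, RT_NEW, NRT_NEW),
}


def order_competing_requests(state, requests):
    """Sort (connection_id, connection_type) pairs by the state's order.

    The sort is stable, so requests of the same type keep arrival order.
    """
    order = OCCUPANCY_ORDER[state]
    return sorted(requests, key=lambda request: order.index(request[1]))


pending = [("c1", NRT_NEW), ("c2", RT_NEW), ("c3", NRT_HANDOFF), ("c4", RT_HANDOFF)]
served_first = order_competing_requests("positively_necessary", pending)
```

In the Positively Necessary State both kinds of handoff connection are served before any new connection, which is the only difference from the other two orders.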

4 Simulation Model and Result 

The proposed scheme is compared with two different methods to evaluate the 
performance. 

Method 1: the resources are reserved exclusively for handoff connections in 
each cell. The remaining resources are shared equally among handoff and new 
connections. This method is called Fixed_Res. 

Method 2: the resources are reserved dynamically based on the current 
connections in the neighboring cells. This method is called Dynamic_Res. 
Fig. 4. Comparison of resource utilization 

Fig. 5. Comparison of transmission delay 



The simulation model is composed of a single cell, which keeps contact with 
its six neighboring cells. Each cell contains a base station, which is 
responsible for the connection setup and tear-down of new applications and for 
serving handoff applications. We consider the following simulation parameters 
for the received signal strength. The mean signal attenuation by the path loss 
is proportional to 3.5 times the propagation distance, and the shadowing has a 
log-normal distribution with a standard deviation of σ = 6 dB. A received 
signal strength below -16 dB is regarded as an error and is therefore excluded 
from the calculation. Figure 4 shows the average percentage of resource 
utilization as a function of connection arrivals, with priority given to 
handoff connections over new connections. The resource utilization for 
Direction-based is higher, up to 25 arrivals, than that for Fixed and Dynamic. 
This improvement may arise because Direction-based allocates the reserved 
resources not only to real-time handoff connections but also to non-real-time 
new connections. In Figure 5, the transmission delay of the three schemes is 
plotted against the number of connections. From the figure, we can see that the 
performance of Direction-based is up to 2.5 times better than that of the 
conventional schemes. This is because non-real-time connections can adaptively 
occupy the resources reserved for real-time connections, which prevents the 
performance degradation due to queueing of non-real-time connections. 

J. Lee, H. Kim, and K.J. Kim 

Fig. 6. Comparison of aggregated data loss rate (x-axis: number of connections) 

Figure 6 shows the aggregated data loss rate. As the number of connections 
increases, Direction-based provides a noticeable improvement over the 
conventional schemes for real-time connections, while slightly degrading 
the performance for non-real-time connections. 
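
The received-signal model described in the simulation setup can be sketched as below. Several points are assumptions of this sketch rather than statements from the paper: the figure 3.5 is interpreted as a path-loss exponent, so the mean loss is 10 · 3.5 · log10(d) dB; the reference level `ref_db` at unit distance is taken as 0 dB; and distances are in arbitrary units. Log-normal shadowing is modelled as a zero-mean Gaussian in dB with σ = 6 dB, and samples below -16 dB are discarded as stated in the text.

```python
import math
import random


def received_strength_db(distance, ref_db=0.0, exponent=3.5, sigma_db=6.0, rng=random):
    """One received-strength sample in dB (path loss plus shadowing)."""
    path_loss_db = 10.0 * exponent * math.log10(distance)
    shadowing_db = rng.gauss(0.0, sigma_db)   # log-normal shadowing in dB
    return ref_db - path_loss_db + shadowing_db


def valid_samples(distance, count, floor_db=-16.0, seed=0):
    """Draw `count` samples and drop those below the error floor."""
    rng = random.Random(seed)                 # seeded for repeatability
    samples = [received_strength_db(distance, rng=rng) for _ in range(count)]
    return [s for s in samples if s >= floor_db]


kept = valid_samples(distance=1.5, count=200)
```

With the shadowing switched off (`sigma_db=0.0`), the function returns the mean level alone, which is a quick way to check the path-loss term in isolation.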

5 Conclusion 

The main aim of this paper is to address the problem of guaranteeing an 
acceptable level of QoS for mobile users as they move from one location to 
another. This is achieved through reservation variables, such as the current 
location and the moving direction, which are presented as a set of attributes 
that describe the user mobility. In this scheme, mobiles are classified 
according to their reservation variables. Based on the reservation variables, a 
scheme that provides predictive QoS guarantees in mobile multimedia networks is 
proposed. In our simulations the proposed scheme improves the resource 
utilization, the delay and the data loss, because our resource reservation 
scheme is more adaptive than existing resource reservation schemes. In our 
scheme, resources are classified, based on the reservation variables, as those 
giving priority to new calls and those giving priority to handoff calls. We 
improve the dropping rate for handoff connections by dynamically adjusting the 
amount of reserved resources according to the amount of occupied resources. The 
determination of the optimal direction should be studied further. Further 
research is also required on the implementation of the scheme and its 
application to handoff and resource allocation strategies. 
Abstract. In this paper, we describe a novel approach to mobile station positioning using a GSM mobile phone. 
The approach is based on an inherent feature of the GSM cellular system, namely that the mobile phone continuously 
measures radio signal strengths from a number of the nearest base stations (antennas), and on the use of this 
information to estimate the phone's location. The current values of the signal strengths are processed by a trained 
artificial neural network executed on the computer attached to the mobile phone to estimate the position of the 
mobile station in real time. The neural network configuration is obtained by using a genetic algorithm that searches 
the space of specific neural network types and determines which one provides the best location estimation results. 
Two general methods are explored: the first is based on using a neural network for classification and the second 
uses function approximation. The experimental results are reported and discussed. 

Keywords: cellular networks, positioning, artificial neural networks. 



1. Introduction - Motivation for Research 



The positioning systems using wireless communication can be traced back to the PULSE 
conference of 1968 [1], whose focus was on Automatic Vehicle Monitoring. Since then, many 
positioning systems have been proposed and used, and they will be described in Section 2. The 
advance of technology has changed the heavy, big systems into light-weight systems such as 
the Global Positioning System (GPS). 

The development of positioning systems primarily focuses on the positioning of vehicles 
and, therefore, they are called Automatic Vehicle Monitoring (AVM) or Automatic Vehicle 
Location (AVL) systems. Although new systems such as the GPS system are used in position- 
ing vehicles, they are not restricted to this application because of their small size, portability, 
and reasonable (low) cost. 

Safety is the primary motivation for vehicle location. In the United States, the Federal 
Communications Commission (FCC) has adopted a Report and Order and Further Notice of 
Proposed Rulemaking (NPRM) that creates rules to govern the availability of basic 911 services and 
the implementation of E911 (Enhanced 911) for wireless services. For basic 911 service, 
the Order requires all cellular, broadband PCS, and certain SMR licensees to transmit all 911 
calls made from mobile handsets that have a code identification to a Public Safety Answering 
Point (PSAP) without any blocking or validation procedures. The Order also requires these 
carriers to provide certain E911 features which enable the PSAP to identify the location of the 
caller, including Automatic Number Identification (ANI) and Automatic Location Identification 
(ALI) within the required timetable [2]. Target accuracy is the ability to locate in latitude 
and longitude a wireless caller within 125 meters Root Mean Square (RMS). 
While safety is the main motivation for wireless position location, other promising 
applications include navigation services, automated billing, fraud detection, roadside assistance, 
cargo tracking, and fleet management. Position location systems will provide new services and 
revenue sources for wireless carriers, greater crime-fighting capabilities for law enforcement 
personnel, and new methods for tracking people and parcels [3]. 

This paper presents an attempt at applying a new approach towards the positioning of a 
GSM-based mobile station. It uses radio signal strengths from serving and neighboring base 
stations that are continuously measured in each mobile station and applies them to a trained 
artificial neural network for positioning. As such, positioning is performed instantaneously 
(in real time). The paper is divided into the following sections. Section 2 presents a brief 
overview of the positioning approaches and explains the reasons for our approach based on 
the use of only signal strengths. Section 3 introduces the "Automatic GSM-based Position- 
ing and Communication System" (AGPCS), which represents a context for the application 
of our positioning models. The AGPCS provides a framework to integrate positioning with 
communication in order to achieve an infrastructure for a number of exciting applications. 
A brief overview of artificial neural network features, which are used for positioning, and 
genetic algorithm features, which are used for the selection of the neural network models 
applied for positioning, is given in Section 4. Two categories of models are developed and 
presented in Sections 5 and 6. A classification model is used to try to determine the area in 
which the mobile station lies. The function approximation models are used to estimate mobile 
station position in two-dimensional space by estimating distances and/or angles of the mobile 
station from a number of base stations or to estimate directly two-dimensional coordinates of 
the mobile station. Experimental results of the application of developed models are presented 
in Section 7. The best positioning accuracy using the direct positioning method with ANN 
as the function approximator and adjusted ANN weights results in an average distance error 
just above 200 m. Section 8 presents conclusions and recommendations for future research 
directions. 

2. A Survey of Positioning Systems 

The positioning systems can be broadly classified into three major categories/classes: 

1. systems using signal strength measurements, 

2. systems using time of arrival of radio signal, and 

3. systems using dead reckoning techniques. 

The methods belonging to class 1 and 2 can be collectively called radio-location methods 
because they rely on the properties of radio signals. 

An excellent overview of various positioning approaches and systems is given in [3]. That 
paper introduces major radio location-based positioning techniques and classifies them into 
categories according to their complexity: from basic techniques, such as triangulation, to the 
most advanced and complex techniques, such as those based on angle of arrival, time of arrival, 
and time difference of arrival of the radio signals. 

The most accurate positioning today is achieved using the satellite-based global positioning 
system (GPS) [4]. The principle behind GPS, as a time-of-arrival system, is simple, although 
the implementation is quite complex. GPS uses precise timing within a group of satellites 
and transmits a spread-spectrum signal to earth on L-band. An accurate clock at the receiver 
measures the time between the signals leaving the satellites and arriving at the receiver. If at 
least three satellites are visible to the receiver, triangulation can be used to find the receiver's 
location. Additional reference stations are used in GPS to improve accuracy further; this is 
called differential GPS. However, GPS has two important disadvantages. First, the information 
on position usually has to be transmitted to some other party requiring that the mobile station 
provides data communication facilities. Most often, it is provided using some sort of radio 
system-based data transfer, such as radio modems which communicate with a specialized 
radio system infrastructure (such as trunked radio), or using an existing public cellular system. 
In the former case, the problem of radio coverage is introduced, which requires investment in 
radio transmission systems. Further costs are incurred by data communication devices. In the 
latter case, the positioning and communication facilities are presently implemented by two 
separate devices. The second disadvantage of GPS is that it is usable only in the case of "clear 
sky", which makes it hardly usable in urban areas, mountainous terrain, and closed/covered 
space. 

Dead reckoning methods locate a vehicle by computing its distance and direction of travel 
from a known fixed initial position. The distance measurements are made using a precision 
odometer and some compass type device to measure azimuth. Factors, such as side winds, 
change in tire pressure, and road conditions, can greatly affect the distance and direction 
measurement. Since the computed location depends upon all previous estimates, errors in 
location tend to accumulate and can quickly lead to sizable position error. To compensate for 
this, the system must be updated on a regular basis. This may be done manually, by proximity 
devices, or by comparing the vehicle's route with known or feasible routes. A system of this 
type, described in [5], uses such a technique corrected by central processor map correlation. 
This type of system has the advantage that it is not susceptible to problems associated with 
radio measurement techniques encountered in an urban environment. 
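
The dead reckoning update described above can be sketched as a running sum of travel legs. The sketch assumes a flat two-dimensional plane, an azimuth measured clockwise from north in degrees, and coordinates given as (east, north); the function name `dead_reckon` and the input format are hypothetical.

```python
import math


def dead_reckon(start_xy, legs):
    """Accumulate (distance, azimuth_deg) legs from a known start position.

    start_xy: (east, north) of the known fixed initial position.
    legs: iterable of (odometer distance, compass azimuth in degrees).
    """
    east, north = start_xy
    for distance, azimuth_deg in legs:
        azimuth = math.radians(azimuth_deg)
        east += distance * math.sin(azimuth)    # azimuth 90 degrees = due east
        north += distance * math.cos(azimuth)   # azimuth 0 degrees = due north
    return east, north


# 100 m north, then 100 m east, starting from the origin.
position = dead_reckon((0.0, 0.0), [(100.0, 0.0), (100.0, 90.0)])
```

Because every leg is added to the previous estimate, a systematic odometer or compass error in any leg carries through to all later positions, which is the accumulation effect noted in the text.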

Many methods and systems have been proposed based on the radio signal strength meas- 
urement [6-8] of a mobile station's transmitter by a set of base stations. Recently, adaptive 
schemes based on the use of cellular systems and on fuzzy logic [9], hidden Markov models, 
and pattern recognition methods [10, 11] have been used to estimate the position of mobiles. 
A study in [9] using computer simulation shows that the error between the exact position and 
the estimated position is in the range of 0 to 575 m. This is based on the assumption that 5 km 
separate the base stations from each other and the point-to-area terrain configuration has a 3 dB 
standard deviation from a normal signal strength distribution. The most recent simulation [12] 
is based on a multidimensional scaling technique that yields some very accurate results. A 
mobile's position is determined in such a way that the measured signal strength of a certain 
base station in the GSM system is best fitted to the known average signal strength at this point. 
The performance of the method was tested by simulation for different simulated scenarios 
[12], but no results from a real cellular environment have been reported. 

Other positioning techniques have been proposed based on the angle of arrival (AOA) or 
the time of arrival (TOA) of incident signals. In the AOA case, the position is estimated 
from base stations that use a directional antenna to measure the AOA of incident signals [13]; 
this requires complex adaptive high-resolution analysis techniques. If TOA is used [14], the 
three-dimensional position of the mobile station is uniquely determined by the intersection of 
three spheres. The major problem encountered with this technique is the requirement that all 
transmitters and receivers in the system have synchronized clocks. Otherwise, even a very small 
timing error could result in a very large position location error. Finally, the time difference 
of arrival (TDOA) method, which is based on the difference between the times at which the 
signal arrives at multiple base stations rather than on the absolute time, is more practical for 
commercial systems [15]. This technique requires only fixed base stations to have precisely synchronized 
clocks. Another limitation for all these techniques is that they rely on a direct line-of-sight 
path from the mobile station to the base stations. This is, however, not true for both urban 
and mountainous environments. The absolute time delay also undergoes multipath effects. 
The GSM-based equalization techniques rely on combining the resulting signal to produce 
the least error. This has the effect of distorting the time delay. However, what is required 
for positioning are multipath rejection algorithms. There have been a number of studies in 
this area, especially in the use of Least Mean Square techniques [16] and extended Kalman 
filters [17]. These techniques have proved to be of some value in removing multipath effects. 
Kalman filters in particular have also been used for the purpose of "averaging" the mobile 
position from a set of fluctuating data, providing excellent results [18]. The investigation of 
the use of Kalman filters in this area, therefore, may lead to promising results. 

Our paper presents the results of the research obtained so far as a part of the project that 
explores an Automatic GSM-based Positioning and Communication System (AGPCS) [19, 
20] using a standard GSM mobile phone in a cellular environment. The fact that a standard 
GSM cellular phone is used for positioning imposes substantial constraints compared to the 
approaches that require additional precise equipment to support positioning. The main con- 
straint is that only received signal strengths from the serving and a number of base stations are 
available for positioning purposes. First, we present the features of the cellular environment 
relevant to our approach to mobile station positioning. We also introduce the artificial neural 
network models that are used in our approach for positioning. Then, we describe our approach 
to positioning using two major general methods. The first approach is based on using a neural 
network for classification that determines the probable area the mobile phone is in. The second 
one uses the function approximation to determine either distances between the mobile phone 
and the neighboring base stations and triangulation to position the phone or the position (lat- 
itude and longitude, or X and Y coordinates) of the mobile station itself. Finally, we describe 
experiments and present results obtained from the application of our models on positioning 
in a real cellular environment and discuss the steps needed to improve the accuracy of our 
models. 
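
The distance-based variant of the second method can be illustrated with a generic two-dimensional position fix from estimated distances to base stations at known coordinates (often called trilateration). This is a textbook linearized least-squares sketch, not the authors' model: the function name `position_from_distances`, the planar coordinates and the noise-free distances in the example are all assumptions.

```python
import math


def position_from_distances(stations, distances):
    """Least-squares (x, y) from distances to three or more known stations.

    Subtracting the first circle equation from the others gives linear
    equations a*x + b*y = c, solved here through the 2x2 normal equations.
    """
    if len(stations) < 3 or len(stations) != len(distances):
        raise ValueError("need at least three stations, one distance each")
    (x0, y0), d0 = stations[0], distances[0]
    saa = sab = sbb = sac = sbc = 0.0
    for (xi, yi), di in zip(stations[1:], distances[1:]):
        a = 2.0 * (xi - x0)
        b = 2.0 * (yi - y0)
        c = d0 ** 2 - di ** 2 + xi ** 2 - x0 ** 2 + yi ** 2 - y0 ** 2
        saa += a * a
        sab += a * b
        sbb += b * b
        sac += a * c
        sbc += b * c
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("stations are collinear; position is not determined")
    return (sac * sbb - sab * sbc) / det, (saa * sbc - sab * sac) / det


# Hypothetical base stations (km) and exact distances to the point (3, 4).
stations = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
distances = [math.hypot(3.0 - sx, 4.0 - sy) for sx, sy in stations]
x, y = position_from_distances(stations, distances)
```

With distances estimated from signal strengths rather than measured exactly, the circles no longer meet in a single point, and using more than three stations lets the least-squares fit average out part of the error.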

3. AGPCS: GSM-Based Positioning and Communication System 

GSM [21, 22] initially handled basic voice services and some emergency calling features, but 
has already added improvements to subscriber identity module (SIM) cards which contain a 
microchip with information on the caller. From the user point of view, the obvious difference 
between GSM and other cellular technologies is that GSM cellular phones operate only digit- 
ally, enabling both voice and data to be transferred directly digitally, without using modems, 
thus providing the backbone of the mobile communication network. 

A variety of data services are offered in GSM. GSM users can send and receive data, at rates 
up to 9600 baud, to users on POTS, ISDN, Packet Switched PDN, and Circuit Switched PDN 
using a variety of access methods and protocols. Other data services include G3 facsimile and 
Short Message Services (SMS), which is a bi-directional service for short alphanumeric (up to 
160 bytes) messages. Messages are transported in a store-and-forward fashion. For point-to- 
point SMS, a message can be sent to another subscriber and an acknowledgment of receipt is 
provided to the sender. SMS can be used in a cell-broadcast mode for sending messages such 
as updates of different sorts. Messages can also be stored in the SIM card for later retrieval. 
The SMS service provides a basic means to transfer data used to estimate the position or 
coordinates of the mobile station. 

Besides voice and data services, the GSM system provides data that might be used for radio 
signal strength measurements and positioning. The mobile station must continuously monitor 
the neighboring cells' perceived power levels. To do this, the mobile station receives a list of 
base stations (channels) on which to perform power measurements. The list is transmitted on 
the base channel, which is the first channel a mobile tunes in on when it is turned on. The GSM 
mobile station receives the downlink signal levels from the serving and up to six neighboring 
base stations on a discrete scale every 0.48 seconds. The GSM mobile station applies a complex 
signal-processing algorithm to determine the signal strengths. This information is part of the 
GSM system and is used in our system to estimate the position of the mobile station. 
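
One way to picture this measurement data is as a fixed-length record holding the serving-cell level and up to six neighbor levels, which can then be handed to a position estimator. The sketch below is purely illustrative: the class `MeasurementReport`, the padding value `MISSING_LEVEL` for unreported neighbors and the example levels are assumptions, and the source does not say how the authors encode their network inputs.

```python
from dataclasses import dataclass, field
from typing import List

MAX_NEIGHBORS = 6      # the text states up to six neighboring base stations
MISSING_LEVEL = 0      # assumed placeholder for a neighbor that is not reported


@dataclass
class MeasurementReport:
    """One measurement period: serving level plus neighbor levels."""
    serving_level: int
    neighbor_levels: List[int] = field(default_factory=list)

    def to_feature_vector(self) -> List[int]:
        """Return a vector of length 1 + MAX_NEIGHBORS, padded if needed."""
        if len(self.neighbor_levels) > MAX_NEIGHBORS:
            raise ValueError("at most six neighbor measurements are expected")
        padding = [MISSING_LEVEL] * (MAX_NEIGHBORS - len(self.neighbor_levels))
        return [self.serving_level, *self.neighbor_levels, *padding]


report = MeasurementReport(serving_level=40, neighbor_levels=[30, 25])
features = report.to_feature_vector()
```

A practical encoding would also need to tie each level to the identity of the base station it came from, since the set of reported neighbors changes as the mobile moves.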

By using and integrating two inherent features of the GSM system (measurements of radio 
signal levels and ability to communicate directly digitally), we have proposed [19, 20] an 
Automatic GSM-based Positioning and Communication System (AGPCS). The AGPCS is 
a real-time system built on top of the GSM system and can be considered as an application 
layer to standard GSM. It performs the positioning of the mobile station in the coverage area of 
the GSM network. The AGPCS mobile station consists of the GSM mobile station (actually 
a handset) and a mobile computer connected to it. Depending on the power of the mobile 
computer, various degrees of intelligence and application complexity can be achieved within 
the AGPCS mobile station. The AGPCS mobile station performs continuous radio signal 
strength measurements and acquisition of measurements to estimate its position. The position 
is estimated using artificial neural network models that are based on current signal strength 
measurements, history of signal strength measurements, as well as some a priori knowledge 
of the environment, as will be shown in the following sections. The mobile computer collects 
signal strength measurements from serving and up to six neighboring base stations. Then, 
it either estimates the area in which the mobile station lies (using classification properties 
of neural networks), evaluates the distance between the mobile station and the neighboring 
base stations in order to use them in a triangulation model, or determines the position of the 
mobile station directly. This operation is performed in real time. This scenario leads to a
self-positioning and communication system (SPCS). The SPCS is useful in applications in which
the AGPCS mobile station and its user want to know their current position; it can also
transfer that information to other parties.

In the second scenario, the mobile computer has a minimum of intelligence and input/ 
output devices. It is used just to collect signal strength measurements, preprocess them, and 
transmit them to a network center (NC), where they are used to estimate position. A simplified 
version of the positioning model can run on the mobile computer and estimate the distances 
to the base stations, or position, which are then sent to the NC. The NC plays a supervisory 
role in the AGPCS system. Further refinement of position can be done and the corresponding 
database updated. The NC maintains data on the positions of a number of mobile stations and 
provides the means for presenting positions on a geographic map display. However, it can be 
used for various other purposes. The NC and a number of AGPCS mobile stations make up an 
AGPCS system. Obviously, the number of independent AGPCS systems or their architecture is 
not limited, because they depend only on the application requirements. Both scenarios involve 
the transfer of messages between the AGPCS stations or between the stations and the network 
center. This communication is performed without employing GSM voice channels. It is based 
on the short message service (SMS) that provides the exchange of short messages without 
using any additional interface equipment. Due to message latency, there can be delays in the
delivery of short messages. In that case, a direct link between the AGPCS mobile station and
the network center can be established and information on signal measurements or position 
transferred immediately. 

The first working versions of systems using AGPCS technology have been developed and 
tested. The AGPCS technology can be used for various applications, including control, as it 
may be easily incorporated into the standard hardware/software environments or used in the 
embedded form. 



4. Neural Networks and Genetic Algorithms 

It is known that only statistical models [23-28] can describe the received signal strengths at the 
mobile station. Resolving the variations of short-term fading, long-term fading, and the path- 
loss law is required to determine the position from the received signal strengths. The statistical 
models involve complex mathematical manipulations. Moreover, the statistical models are not 
universal. Extensive experiments are required to find appropriate expressions and parameters 
to describe the local behavior, which is time-consuming. 

A much simpler, faster, and more flexible approach is to use artificial neural networks
(ANNs) [29]. A wide range of neural network configurations and learning algorithms exists.
In our work, we have placed a particular emphasis on the Multilayer Perceptron (MLP) neural
network and its backpropagation learning algorithm. Once a particular network topology is
chosen, finding an appropriate network size is probably the most important task, because size
has a profound effect on the performance of the network. The optimum network architecture (size) can be found
using a trial-and-error approach or some optimization technique. The optimization technique 
used in our approach is a genetic algorithm [30]. 

Almost all of the approaches found in the literature to describe the behavior of the signal 
strength use a statistical model. They are usually founded on three assumptions: linearity, 
stationarity, and second-order statistics, with particular emphasis on Gaussianity. Yet, most, if 
not all, physical signals in real-life applications are generated by dynamic processes that are at 
the same time nonlinear, nonstationary, and nonGaussian. One way to analyze these processes 
is to use ANNs. 

ANNs are a paradigm for the intelligent processing of information for some specific objective,
e.g., classification, pattern recognition, decision-making, system behavior identification,
and prediction. ANNs mimic human learning processes and therefore have great potential as
adaptive learning systems [31]. An ANN represents a method of synthesizing a mapping between
input and output variables by learning a set of arc weights and node thresholds of a connectionist
model based on training examples. ANNs have been developed in a wide variety of
configurations with some common characteristics, such as massive interconnection of simple 
computational elements. They are characterized by the model of their neurons, the connections 
between them, and the methods to train them to do a specific task. 

ANNs have a number of important properties. Some of these properties are especially 
useful from the point of view of processes that take part in the estimation of the position of 
the mobile station from the signal strengths received from neighboring base stations [32]: 

• ANNs are distributed nonlinear devices: they have the inherent ability to model underlying
nonlinearities contained in the physical mechanism responsible for generating the
input data.

• ANNs have the potential to be fault-tolerant in the sense that the performance is degraded
gracefully under adverse operating conditions.

• ANNs have the natural ability to adapt their free parameters to statistical changes in the
environment in which they operate.

• ANNs provide a nonparametric approach to nonlinear estimation. The term nonparametric is
used in a statistical sense, meaning that no knowledge of the underlying probability
distribution is required.

• ANNs trained in a supervised manner are universal approximators. Multilayer feed-forward
networks are universal approximators in the sense that they can approximate any continuous
input-output mapping to any desired degree of approximation given a sufficient number
of hidden units.

For the purpose of mobile station positioning, we are interested in static networks whose 
output is a function of the current input. The multilayer perceptron (MLP) is the most widely 
used static neural network. The learning algorithm is the fast backpropagation, which is a 
supervised learning algorithm (backpropagation with a momentum) [29]. 
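For reference, the weight update of backpropagation with a momentum term is conventionally written as below. The symbols are the usual textbook ones and are assumed here, since the source does not state its notation: E is the training error, η the learning rate, α the momentum coefficient, and w_ij a connection weight.

```latex
\Delta w_{ij}(t) = -\eta \, \frac{\partial E}{\partial w_{ij}} + \alpha \, \Delta w_{ij}(t-1),
\qquad
w_{ij}(t+1) = w_{ij}(t) + \Delta w_{ij}(t)
```

The momentum term reuses a fraction of the previous step, which damps oscillations and usually speeds up convergence on flat regions of the error surface.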

An MLP is capable of approximating arbitrary nonlinear mappings, and given a set of examples,
the backpropagation algorithm can be called upon to learn the mapping at the example
points. However, there are a number of practical concerns. The first is the matter of choosing the
network size. The second is the learning time. The final concern is the ability of the network to 
generalize: its ability to produce accurate results on new samples outside the training set [29]. 
Generalization is most heavily influenced by three parameters: the number of data samples, 
the complexity of the underlying problem, and the network size. The generalization property 
will also determine the learning time of the network. 

Good generalization can only be achieved if the optimum network architecture is found. In
general, it is not known what size network works best for a given problem. Further, it is not
likely that this issue will be resolved in the general case because each problem will demand
different capabilities from the network. If the network is too small, it will not be capable of
forming a good model of the problem. On the other hand, if the network is too large, it will
overfit the training data and perform poorly on new samples. There is a wide range of optimization
algorithms to find the size of the network. All of them can be categorized into one of two 
general approaches: 

1. A larger-than-necessary network is used initially and trained until an acceptable solution 
is found. After this, hidden units and weights are removed if they are no longer actively 
used. Methods using this approach are called pruning procedures. 

2. A small network is used initially and then it grows additional units and weights until
a satisfactory solution is found. Methods using this approach are called constructive
procedures.

Another possible optimization technique, which differs from pruning and constructive procedures
and is the one used in our approach, is a genetic algorithm [30]. Genetic algorithms are a family
of computational models inspired by evolution. These algorithms encode a potential solution 
to a specific problem on a simple chromosome-like data structure and apply recombination 
operators to these structures, in order to preserve critical information. In our approach, the 
input variables and the neural network structures, including the size of the network and the 
activation function used in the network, are encoded into the chromosome. The chromosomes 
are represented as binary strings. Each of these chromosomes is decoded into neural networks. 
The genetic algorithm process is based on a fundamental cyclic process, which consists of:

1. Creating an initial population of "genotypes" (genetic representations of the neural network).

2. Building neural networks ("phenotypes") based on the genotypes. 

3. Training and testing the neural networks to determine how good they are. 

4. Comparing the fitness of the networks and keeping the best ones. 

5. Selecting those networks in the population that are better and discarding those that are not 
good enough. 

6. Refilling the population back to the defined size. 

7. Pairing up the genotypes of the neural networks. 

8. "Mating" the genotypes by exchanging genes (features) of the networks. 

9. "Mutating" the genotypes in some random fashion and returning to step 2. 

The above cyclic process is called a generation and continues until some stopping criterion
(required accuracy) is reached or the desired number of generations has been performed. The best
network obtained is taken as the optimum network. The genetic algorithm is also used to search
for the most relevant input variables from the provided input space, so the final network will
have both an optimal network architecture and a minimal number of input variables. All neural
networks that have been investigated in our approach to positioning are two-layer networks. The
activation function in the hidden layer is the hyperbolic tangent, tanh, while the genetic algorithm
determines the type of the activation function in the output layer. The fitness function used in 
step 4 is defined as the root mean squared error on the test set. Root mean squared error is 
calculated between the expected and actual neural outputs and is averaged across all output 
neurons, if more than one is employed. 
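The generational cycle listed above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the chromosome layout (10 input-mask bits, 6 bits for a hidden-layer size of 1 to 64, 1 bit for the output activation), the truncation selection, the one-point crossover, the mutation rate, and the `toy_fitness` stand-in are all assumptions. In the paper's setting the fitness of a chromosome would be the test-set RMSE of the MLP it decodes to; training a network is omitted here to keep the sketch short.

```python
import random

def rmse(expected, actual):
    """Root mean squared error over all samples and all output neurons."""
    total, count = 0.0, 0
    for exp_row, act_row in zip(expected, actual):
        for e, a in zip(exp_row, act_row):
            total += (e - a) ** 2
            count += 1
    return (total / count) ** 0.5

def decode(chromosome, n_inputs=10, hidden_bits=6):
    """Split a binary chromosome into input mask, hidden size, output activation."""
    mask = chromosome[:n_inputs]
    bits = chromosome[n_inputs:n_inputs + hidden_bits]
    hidden = 1 + int("".join(map(str, bits)), 2)  # 1..64 hidden neurons
    activation = "logistic" if chromosome[n_inputs + hidden_bits] else "linear"
    return mask, hidden, activation

def toy_fitness(chromosome):
    """Hypothetical stand-in for 'train the decoded MLP and return its test RMSE'."""
    mask, hidden, _ = decode(chromosome)
    return abs(sum(mask) - 5) + abs(hidden - 29) / 64

def evolve(fitness, length=17, pop_size=20, generations=30,
           mutation_rate=0.02, seed=0):
    """Minimize `fitness` over binary chromosomes (lower is better)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    best = min(population, key=fitness)
    for _ in range(generations):
        ranked = sorted(population, key=fitness)      # evaluate and compare
        survivors = ranked[: pop_size // 2]           # keep the better half
        children = []
        while len(survivors) + len(children) < pop_size:  # refill the population
            mother, father = rng.sample(survivors, 2)     # pair up genotypes
            cut = rng.randrange(1, length)
            children.append(mother[:cut] + father[cut:])  # "mate" by crossover
        for child in children:                        # "mutate" by bit flips
            for i in range(length):
                if rng.random() < mutation_rate:
                    child[i] ^= 1
        population = survivors + children
        best = min(population + [best], key=fitness)
    return best

if __name__ == "__main__":
    print(decode(evolve(toy_fitness)))
```

Because the best chromosome seen so far is always retained, the returned solution never gets worse from one generation to the next.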

5. Positioning Using Classification 

Classification can be used to position the mobile station when the precise location of the 
mobile is not required. When using classification, the area of interest is divided into smaller 
sections and the identification of the section in which the mobile station can be found is the 
task of classification. The classification model and results of its application to the considered 
area are presented in this section. 

5.1. Classification Model 

ANN is used to perform the classification of the received signal strengths which represent 
input signals from a number of base stations' antennas. Experiments were performed in an 
area with a size of approximately 3 km by 4 km in which 4 base stations with ten antennas 
were located; three of them used directional antennas as described in the next section. The 
situation is illustrated in Figure 1. All coordinates are presented in NZMG (New Zealand Map 
Grid) format, which uses X and Y coordinates in meters and is directly related to the usual 
longitudes and latitudes. 

The GSM mobile station provided the signal strength measurements from serving and 
up to six neighboring base stations' antennas. We limited this number to six signal strength 
values, which are collectively called a measurement record. These six values were divided 
into two groups of three measurements each with a time delay of around 2 seconds due to 
the limitation of equipment used. Despite this delay and time difference, we assumed that 
they were collected at the same time. Inspection of the data collected at the training and
testing sites showed that at most 4 of the 10 antennas appeared in any measurement record. The
remaining readings were from other base stations outside the area included in the experiment. In
the case of readings obtained from the base stations outside the considered area, we assumed
that the signal strengths from the remaining antennas not present in the record had the value 0.
This obviously represents a further approximation. Nevertheless, since the GSM mobile station
does not provide the actual values of those measurements, we found it the only practical way
to tackle the problem.

Figure 1. Illustration of the area used in the experiments.
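The zero-filling convention described above amounts to mapping each measurement record onto a fixed-length vector with one slot per antenna. A minimal sketch follows; the antenna labels (three sectors at base stations A, B, and C plus a single antenna at D) and the numeric readings are assumptions for illustration only.

```python
# Assumed labels for the 10 antennas in the study area (hypothetical naming).
ANTENNAS = ["A1", "A2", "A3", "B1", "B2", "B3", "C1", "C2", "C3", "D"]

def build_input_vector(record, antennas=ANTENNAS):
    """Map {antenna: signal level} to a fixed-length input vector.

    Antennas absent from the record get the value 0; readings from
    antennas outside the study area are ignored.
    """
    return [float(record.get(name, 0.0)) for name in antennas]

if __name__ == "__main__":
    # "X9" stands for a base station outside the considered area.
    print(build_input_vector({"A3": 31, "D": 18, "X9": 25}))
```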

For classification purposes, the considered area was first divided into 3 smaller sections 
as illustrated in Figure 2. A neural network was developed for this classification problem and 
used to classify input measurement records. A second model was then developed by dividing 
the same area into 8 smaller sections as shown in Figure 3, and the performance of the two 
models was then compared. Each section was coded using the 1-out-of-C coding scheme, in which
C represents the number of categories present in the output.
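As an illustration of 1-out-of-C coding, the sketch below encodes a section number as a target vector and decodes a network output back to a section by taking the largest output. The function names and the 1-based section numbering are assumptions.

```python
def encode_section(section, n_sections):
    """Return the 1-out-of-C target vector for a 1-based section number."""
    if not 1 <= section <= n_sections:
        raise ValueError("section out of range")
    return [1 if i == section - 1 else 0 for i in range(n_sections)]

def decode_section(outputs):
    """Return the 1-based section whose output neuron is largest."""
    return max(range(len(outputs)), key=lambda i: outputs[i]) + 1

if __name__ == "__main__":
    print(encode_section(2, 3))
    print(decode_section([0.1, 0.7, 0.2]))
```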

ANN tries to learn from the input space (signal strengths of various antennas) and the 
corresponding output of the training data to predict the section the mobile station is in from 
the input space of the testing data. The predicted outputs in the test set are computed by 
passing the received signal strengths collected at the testing sites through the optimum network 
discovered by the genetic algorithm. The predicted outputs are then compared to the actual 
outputs. Accuracy of the classification for each site is the average correct classification made 
for all outputs for that site. 
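One plausible reading of the per-site accuracy is the percentage of measurement records from a site that are assigned to the correct section. The sketch below computes that quantity; the interpretation and the function name are assumptions, since the source does not give a formula.

```python
def site_accuracy(predicted_sections, true_section):
    """Percentage of records at one site classified into the true section."""
    if not predicted_sections:
        raise ValueError("no records for this site")
    correct = sum(1 for p in predicted_sections if p == true_section)
    return 100.0 * correct / len(predicted_sections)

if __name__ == "__main__":
    print(site_accuracy([1, 1, 2, 1], true_section=1))
```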

Two cases were considered: 

1. The optimal network found by the genetic algorithm that contains a minimum number of
inputs (from antennas that influence the classification process), and

2. The optimal network found by the genetic algorithm that takes into account all inputs (all
antennas in the considered area) regardless of their presence in the measurement records.

Figure 2. Division of the area into three sections.

Figure 3. Division of the area into eight sections.

5.2. Results of Classification Model Application 

In this section, we present the performance of classification models when the considered area 
was divided into 3 and 8 smaller sections. The sections were non-overlapping with sharp 
boundaries between them as shown in Figures 2 and 3. Our aim was to investigate how
classification models position mobile stations in given areas and to analyze the accuracy of
these models. The network architectures obtained by the application of a genetic algorithm have
a single hidden layer with a hyperbolic tangent activation function in the hidden neurons. The
genetic algorithm was restricted to investigating ANNs with up to 64 hidden neurons.

Table 1. Results of the classification of the ANN with 3 and 8 outputs.
[Table body garbled in the source; surviving entries: 92.45, 100, 11, 98.24, 8, 84.21, 13, 2, 40.38, 65.38, 14, 14.]

First, we considered the case in which the area was divided into 3 smaller sections. As a 
result of the genetic algorithm application, when received signal strengths from 10 antennas 
were used as input variables, the best network architecture contained 48 hidden neurons using 
tanh activation function and 3 output nodes with logistic activation function. The overall
correct classification rate of the test set was 85.39%. When the number of inputs was also
optimized, the genetic algorithm found that a model with only 5 input variables described the
classification even better. The best network architecture had 5 inputs, 29 hidden nodes with tanh
activation function, and 3 output nodes with linear activation function. The overall
classification rate of the test set was 86.62%. The classification rates for all locations used
in the experiment for both models are presented in Table 1.

The results of classification for positioning in Table 1 show that sites 9 and 11 were 
wrongly classified when the considered area was divided into 3 sections and all 10 input 
variables were used. The correct classification rate was fairly high for all locations other than 
these two sites, indicating that there was a high confidence in correct classification. When the 
ANN was used with an optimized number of input variables, the overall classification rate was 
improved and only site 11 had a classification rate that can be considered incorrect.

The second case is similar to the first one, but the considered area was divided into eight
smaller sections of 1 km by 1 km. The best neural network found for the model with 10 input
variables was the network with 54 hidden nodes with tanh activation function and 8 output nodes
with logistic activation function. The overall classification rate of the test set in this case
dropped to 76.09%. When the number of inputs was also optimized, the network again contained
only 5 input variables, with 49 hidden nodes with tanh activation function and 8 output nodes
with linear activation function. The overall correct classification rate of the test set was
80.43%. The results of classification for all 14 locations are shown in Table 1. For the
non-optimized network, classification for site 9 was completely incorrect and for sites 11
and 13 unsatisfactory, while for the optimized input variables case only site 9 was classified
unsatisfactorily.

From the results of the application of the ANN models, a number of observations can be made.
The correct classification rate in the test set improves when ANNs with an optimized
number of input variables are used. Also, the performance of an ANN with a lower number of
outputs is better. Despite the sources of error (reduced accuracy), the results demonstrate the
feasibility of the approach for mobile station positioning. This approach can be useful if the
precise location of the mobile is not required, for instance as a first estimate of the position
or in an application such as the handover process. One possible improvement
of the classification model is to introduce a degree of overlap between sections to avoid
misclassification. This approach will be analyzed in our future experiments.

6. Positioning Using Function Approximation 

Another approach to achieve the mobile station positioning is to use ANNs as a function 
approximator. We analyzed two methods: 

1. The received signal strengths are used to establish the relationship between the distance 
and angle of the mobile station and the antenna and indirectly take part in positioning. 
Two cases are considered further and used to position the mobile station: 

• estimation of distance and angle from the known antenna site. 

• triangulation, if at least three distances from three antennas belonging to three different 
base stations are known. 

Both these methods will be referred to as positioning with distance prediction. 

2. The received signal strength input variables are used as in the preceding classification 
problem to model directly the two-dimensional coordinates of the mobile station. This 
method will be referred to as direct positioning. 

6.1. Positioning Using Distance Prediction 

Using this approach, we first established the relationship between the received signal strengths 
and the distances to the experimental sites for each antenna. The best network architecture 
discovered by the genetic algorithm for each antenna was first found. Then, the predicted 
distance, the actual distance, and the distance error were determined for each test site. The 
final distance for a particular site was found by running the input data for that site through the 
neural network for that antenna and averaging for the number of samples taken.

Table 2. Average absolute angle error for all base stations.

Base station                        A        B        C        D
Average absolute angle error (°)    6.15     14.80    6.30     17.74

Table 3. Dependence of average angle error on the distance.

Base station     A        B        C        D
Angle (°)        0.04     92.94    6.10     6.41
Distance (m)     1912     42.78    2515     1928


Another analysis we performed explored whether a relationship exists between the inputs
to the neural network and the angles between the mobile station and the corresponding base
stations. The analysis was based on the predicted coordinates from the model with
the optimized number of input variables together with the correctional network. The actual
distance of the mobile to each antenna and the error of the predicted angle were examined
to see if a relationship existed. The predicted angles can be used later both as an input to
other neural networks and to position the mobile directly. The predicted angle of the mobile
station to the main beam of each antenna was used together with the signal strength to model
the distance between the mobile and each antenna. This is because directional
antennas are installed in the network and their radiation pattern is not identical in all
directions. A two-dimensional problem is assumed.

The best direct positioning neural network with optimized inputs and correctional network 
was used to calculate predicted angles of the estimated positions to the base stations. Apart 
from some outliers, the predicted angles were quite accurate. The average absolute angle error 
ranged from 6.15° to 17.74° as shown in Table 2. 
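The angle from a base station to an estimated position, and the absolute angle error against the true position, can be computed as in the sketch below. The convention used here (angle measured counterclockwise from the Easting axis, in degrees) is an assumption; the source does not state its reference direction, and the error is independent of that choice as long as both angles use the same one.

```python
import math

def bearing_deg(base_xy, mobile_xy):
    """Angle from base station to mobile, in degrees within [0, 360)."""
    dx = mobile_xy[0] - base_xy[0]
    dy = mobile_xy[1] - base_xy[1]
    return math.degrees(math.atan2(dy, dx)) % 360.0

def abs_angle_error(predicted_deg, actual_deg):
    """Smallest absolute difference between two angles, in [0, 180]."""
    diff = abs(predicted_deg - actual_deg) % 360.0
    return min(diff, 360.0 - diff)

if __name__ == "__main__":
    base = (0.0, 0.0)
    predicted = bearing_deg(base, (1000.0, 900.0))
    actual = bearing_deg(base, (1000.0, 1000.0))
    print(abs_angle_error(predicted, actual))
```

The wrap-around handling matters near the 0°/360° boundary, where a naive subtraction would report an error close to 360° for two nearly identical directions.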

If we look at the dependence of the average angle error on the distance to the base station,
the error tends to decrease as the distance increases. As an example, the angle errors for a
typical site (site 6) are shown together with the distances from the base stations in Table 3.

Since the predicted angles showed a relatively high degree of accuracy, they were used as 
inputs in another neural network to position the mobile station. The predicted angles could 
also be used for the estimation of the initial position of the mobile station, as a part of the 
other positioning algorithm. 

The relationship between the received signal strength and the distance from the mobile 
station to the base station was modeled for each antenna. Using a genetic algorithm, different 
neural networks were obtained as optimal for each antenna. Predicted angles for each test 
site were passed through the developed networks with the received signal strengths to output 
the predicted distance of the mobile station from a particular antenna. Table 4 illustrates the 
results obtained for site 1 and Table 5 gives the average absolute errors for all sites. 

We applied and analyzed two approaches to position the mobile station using distance 
prediction: 

1. Using the estimated distance and estimated angle of the mobile station to a particular
antenna.

2. Using triangulation, where distances to 3 antennas from three different base stations are
required to position the mobile.

Results of both these approaches are presented below.

Table 4. Results of neural network application for site 1 (a).

Antenna    Predicted       Actual          Absolute error
           distance (m)    distance (m)    (m)
A3         1496            1077            419
B1         1227            1365            138
B3         1297            1365            68
C1         1217            1969            752
C2         1769            1969            200
C3         1393            1969            576
D          1033            697             305
Average distance error                     351

(a) Antennas with the same initial letter are at the same site (base station).

Table 5. Average absolute errors for distance prediction for all sites.

Site    Error (m)
1       351
2       767
3       173.5
4       498.8
5       299.5
6       670.8
7       483.3
8       483.9
9       305.2
10      294.7
11      304.3
12      653.3
13      388.1
14      463.5

Positioning using estimated distance and estimated angle

The distance between the mobile station and a particular antenna, together with the angle of the
mobile station to that antenna, is sufficient to position the mobile station. Since more than
one antenna was present at each base station, the final calculated position was obtained by
averaging all estimated positions. The errors of the final estimated positions for all sites
are shown in Table 6.
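Converting one antenna's distance and angle estimate into a position is a polar-to-Cartesian step. The sketch below assumes the angle is measured counterclockwise from the Easting axis in degrees; the function name and the sample coordinates are hypothetical.

```python
import math

def position_from_polar(base_xy, distance_m, angle_deg):
    """Position estimate (X, Y) from a base station, a distance, and an angle."""
    theta = math.radians(angle_deg)
    return (base_xy[0] + distance_m * math.cos(theta),
            base_xy[1] + distance_m * math.sin(theta))

if __name__ == "__main__":
    # Hypothetical base station coordinates in meters.
    print(position_from_polar((2000.0, 3000.0), 1200.0, 30.0))
```

Estimates obtained from several antennas can then be averaged coordinate by coordinate to give the final position.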

Positioning using triangulation 

The predicted distances from the mobile station to the three base stations were used to estimate 
the position of the mobile station using triangulation. Table 6 shows errors in the estimated 
positions (in meters). Obviously, this model produces excessive errors. 
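The source does not spell out how the three distances are combined. One common formulation, shown here as an assumption, subtracts the first circle equation from the other two, which leaves a 2-by-2 linear system in the unknown coordinates (strictly speaking this is trilateration). With noisy distance predictions the three circles do not meet in one point, which is one reason such a scheme can produce large errors.

```python
def trilaterate(p1, d1, p2, d2, p3, d3):
    """Solve for (x, y) given three base station positions and distances."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Subtracting circle 1 from circles 2 and 3 removes the quadratic terms.
    a11, a12 = 2.0 * (x2 - x1), 2.0 * (y2 - y1)
    a21, a22 = 2.0 * (x3 - x1), 2.0 * (y3 - y1)
    b1 = d1 ** 2 - d2 ** 2 + x2 ** 2 - x1 ** 2 + y2 ** 2 - y1 ** 2
    b2 = d1 ** 2 - d3 ** 2 + x3 ** 2 - x1 ** 2 + y3 ** 2 - y1 ** 2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("base stations are collinear")
    # Cramer's rule for the 2x2 system.
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - b1 * a21) / det
    return x, y

if __name__ == "__main__":
    # Exact distances from the point (3, 4) to three hypothetical base stations.
    print(trilaterate((0.0, 0.0), 5.0,
                      (10.0, 0.0), 65.0 ** 0.5,
                      (0.0, 10.0), 45.0 ** 0.5))
```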

6.2. Direct Positioning 

This model cascades two neural networks in a feed-forward manner as shown in Figure 4. The 
input space in the first network was the signal strength of the various antennas, the same as in 
the classification approach. Zeros were used to represent situations in which no antenna was 
found in the measurement record (not one of the neighboring antennas). The output of this 
network represents the position of the mobile station in the NZMG coordinates (Northing, Y, 
and Easting, X). The training and validation procedures for this ANN were the same as for 
the classification model, but the outputs were the coordinates of the mobile station. After the
training session was completed, the predicted NZMG coordinates of this ANN in the training
and testing sets were used together with the original signal strengths to train and validate
a second ANN. The outputs of the second network were again the NZMG coordinates that
represent the mobile station position. This second network can be considered as a correctional
network.

Table 6. Position estimation using estimated distances and angles and triangulation.

Location                      Model using distances     Triangulation model
                              and angles
                              Distance error (m)        Distance error (m)
1                             447.66                    700.1
2                             168.90                    1827
3                             95.68                     235.7
4                             298.20                    419.9
5                             348.00                    244.2
6                             642.86                    812.6
7                             461.43                    606.4
8                             292.84                    819.7
9                             537.96                    382.8
10                            281.86                    895.6
11                            290.00                    599.0
12                            604.65                    2062
13                            393.00                    788.3
14                            378.00                    291.8
Average distance error (m)    374.36                    763.2

Figure 4. Cascading two neural networks to achieve direct positioning.

Table 7. Experimental results of the direct positioning method.

Model                   Average      Minimum        Maximum        Main network (a)           Correctional network
                        absolute     error (m)      error (m)
                        error (m)
Non-optimal without     313.95       127.34         659.49         H.N. = 45 tanh
correctional network                                               O.N. = 2 linear
                                                                   Inputs = 10
Non-optimal with        290.38       111.02         505.72                                    H.N. = 44 tanh
correctional network                                                                          O.N. = 2 linear
                                                                                              Inputs = 10
Optimal without         301.18       [illegible]    [illegible]    H.N. = [illegible] tanh
correctional network                                               O.N. = 2 logistic
                                                                   Inputs = 7
Optimal with            278.79       74.33          528.25                                    H.N. = 50 tanh
correctional network                                                                          O.N. = 2 logistic
                                                                                              Inputs = 7

(a) H.N. = hidden nodes; O.N. = output nodes.
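The data flow of the cascade described above (a main network followed by a correctional network that sees both the original signal strengths and the first prediction) can be sketched with the two trained networks treated as opaque callables. The stand-in models in the usage lines are hypothetical and only exercise the wiring.

```python
def cascade_predict(main_net, correctional_net, signal_strengths):
    """Run the two-stage cascade and return the corrected (X, Y) estimate.

    main_net:         callable, signal strengths -> (X, Y)
    correctional_net: callable, signal strengths + first (X, Y) -> (X, Y)
    """
    first_xy = main_net(signal_strengths)
    second_input = list(signal_strengths) + list(first_xy)
    return correctional_net(second_input)

if __name__ == "__main__":
    # Hypothetical stand-ins for trained networks.
    main = lambda s: (sum(s), 0.0)
    correction = lambda v: (v[-2] + 1.0, v[-1] - 1.0)
    print(cascade_predict(main, correction, [1.0, 2.0]))
```

Each stage is trained separately, but at prediction time the pair behaves like a single feed-forward mapping from signal strengths to coordinates.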



Both ANNs were found by using a genetic algorithm. Two situations, as in the classification
method, were considered:

1. An optimal neural network that considered all input variables, and

2. An optimal neural network with the optimized (minimized) number of inputs.

The final predicted outputs (coordinates) for each site were found by simple averaging over the
number of samples taken at that location.

The results of applying all direct positioning models are summarized in Table 7 for comparison 
purposes. These results were obtained for 14 test sites. 
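The evaluation described above (average the predicted coordinates over the samples at a site, measure the distance to the GPS position, then summarize over all test sites) can be sketched as follows. The function names and the numbers in the usage lines are hypothetical.

```python
import math

def mean_position(samples):
    """Coordinate-wise average of the (X, Y) predictions for one site."""
    n = len(samples)
    return (sum(x for x, _ in samples) / n, sum(y for _, y in samples) / n)

def distance_error(predicted_xy, actual_xy):
    """Euclidean distance in meters between predicted and actual position."""
    return math.hypot(predicted_xy[0] - actual_xy[0],
                      predicted_xy[1] - actual_xy[1])

def summarize(errors):
    """Average, minimum, and maximum error over all test sites."""
    return {"average": sum(errors) / len(errors),
            "minimum": min(errors),
            "maximum": max(errors)}

if __name__ == "__main__":
    site_estimate = mean_position([(100.0, 210.0), (120.0, 190.0)])
    print(distance_error(site_estimate, (100.0, 200.0)))
    print(summarize([100.0, 200.0, 300.0]))
```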

The cascade of two neural networks was intended to emulate a feedback architecture, in order to
analyze whether performance would improve. Although the networks were trained individually, they
operate as a single network. The experiments show that the networks with the
optimized number of inputs improve positioning accuracy. Also, the correctional networks
improve the average absolute error only slightly, but reduce the maximum and minimum errors
significantly.

Recently, we noticed that besides the network architecture, presented by the number of 
input variables, hidden layers, and hidden neurons, and the type of activation functions used, 
some neural network parameters also had a large impact on model accuracy. These paramet- 
ers included weight initialization range, learning rate, and momentum. First, a better weight 
initialization range was found. We made an attempt to adjust neural network parameters and 
further improve the results of the best-found network. Then, the network model based on that 
configuration was trained. The initial value of the weight initialization range was [-0.3, +0.3]. 
The motivation for starting with small weights was that large weights tend to prematurely 
saturate units in a network and render them insensitive to the learning process. However, if the 
weights are too small, a similar problem can occur. Therefore, it is hard to determine what the 
initial weight value should be. A strategy for choosing the initial weight values so as to avoid 
premature saturation was suggested in [33] and was applied to our best-found network with 
the correctional network. The average distance error improved further, by more than 25%, 
to 205.3 m, indicating that better ANN training techniques and tuning of the parameters 
should be investigated. 

Model                                                            Average distance error (m) 

Direct positioning - Optimal with correctional network 
  and adjusted weight initialization                             205.32 
Direct positioning - Optimal with correctional network           278.79 
Direct positioning - Non-optimal with correctional network       290.38 
Direct positioning - Optimal without correctional network        301.18 
Direct positioning - Non-optimal without correctional network    313.95 
Predicted distances and angles                                   374.46 
Triangulation                                                    763.20 
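
The saturation argument behind the choice of a small weight initialization range can be illustrated numerically. The sketch below is not the strategy of [33], and it assumes a single sigmoid unit with ten inputs drawn uniformly from [-1, 1]; the function names, the input count, and the wide range of [-10, +10] used for contrast are all illustrative assumptions. It compares the average slope of the activation for weights drawn from [-0.3, +0.3] with the slope for the much wider range. It shows only the large-weight side of the problem.

```python
import math
import random


def sigmoid(x):
    """Logistic activation (assumed here; the paper does not fix one)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_weights(n_inputs, limit, rng):
    """Draw n_inputs weights uniformly from [-limit, +limit]."""
    return [rng.uniform(-limit, limit) for _ in range(n_inputs)]


def mean_sigmoid_slope(limit, n_inputs=10, n_trials=2000, seed=0):
    """Average derivative of one sigmoid unit over random weights and inputs.

    A value close to 0.25 means the unit sits in its sensitive region;
    a value close to 0 means it is saturated and will learn slowly.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        weights = init_weights(n_inputs, limit, rng)
        inputs = [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
        net = sum(w * x for w, x in zip(weights, inputs))
        out = sigmoid(net)
        total += out * (1.0 - out)  # derivative of the sigmoid at net
    return total / n_trials


slope_small_range = mean_sigmoid_slope(0.3)   # range [-0.3, +0.3]
slope_large_range = mean_sigmoid_slope(10.0)  # hypothetical wide range
print(slope_small_range, slope_large_range)
```

With the narrow range the average slope stays near its maximum of 0.25, whereas the wide range pushes most units into the flat tails of the sigmoid, where the gradient used by backpropagation is small.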



7. Analysis of Results and Discussion 

The total number of antennas was 10. The GSM service provider (BellSouth N.Z.) furnished 
the data on the antennas. The area used in experiments was, according to the definition of 
[28], predominantly a suburban one with a mixture of light urban areas. The light urban 
category corresponds to the fringes of the Central Business District and to suburban shopping 
centers, where buildings are usually no higher than two or three levels. The total number of 
measurements taken for each training and test site was around 50. 

Table 8 is used to compare average absolute errors of the estimated positions for all applied 
models. As can be seen, the best results were achieved using the direct positioning method 
with the optimal number of inputs and correctional neural network. Generally, the methods 
with direct positioning are less demanding computationally. 
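
As a worked check of the comparison, the relative improvements between models can be computed directly from the reported average distance errors. The short Python sketch below uses only the values reported in the table; the dictionary keys and the helper name are illustrative.

```python
# Average distance errors (m) as reported in the comparison table.
errors_m = {
    "direct_optimal_correctional_adjusted_init": 205.32,
    "direct_optimal_correctional": 278.79,
    "direct_nonoptimal_correctional": 290.38,
    "direct_optimal_no_correctional": 301.18,
    "direct_nonoptimal_no_correctional": 313.95,
    "predicted_distances_and_angles": 374.46,
    "triangulation": 763.20,
}


def relative_improvement(old_error, new_error):
    """Percentage reduction in error when moving from old_error to new_error."""
    return 100.0 * (old_error - new_error) / old_error


# Effect of the adjusted weight initialization on the best-found network.
gain_from_init = relative_improvement(
    errors_m["direct_optimal_correctional"],
    errors_m["direct_optimal_correctional_adjusted_init"],
)

# Best model compared with triangulation.
gain_over_triangulation = relative_improvement(
    errors_m["triangulation"],
    errors_m["direct_optimal_correctional_adjusted_init"],
)

print(f"{gain_from_init:.1f}% {gain_over_triangulation:.1f}%")
```

The first figure is consistent with the reported improvement of more than 25% obtained by adjusting the weight initialization.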

All results shown in this paper were obtained as average values by passing a number of 
measurements through corresponding neural networks. The actual positions of all experimental 
sites were obtained using a GPS unit. Therefore, the error of the GPS system was 
incorporated into the current models. Its influence on our final results could not be analyzed. 
Moreover, our analysis assumed the positioning problem to be two-dimensional, introducing 
another source of error. 
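
The error measure used throughout can be sketched as follows, assuming that estimated and GPS-derived positions are expressed as (x, y) pairs in metres on a local planar grid. The coordinates below are made up for illustration, and the function names are not taken from the original work.

```python
import math


def distance_error(estimated, actual):
    """Planar (two-dimensional) distance in metres between two (x, y) points."""
    return math.hypot(estimated[0] - actual[0], estimated[1] - actual[1])


def error_summary(estimates, actuals):
    """Mean, minimum and maximum distance error over paired positions."""
    errors = [distance_error(e, a) for e, a in zip(estimates, actuals)]
    return {
        "mean": sum(errors) / len(errors),
        "min": min(errors),
        "max": max(errors),
    }


# Hypothetical network outputs and GPS reference positions (metres).
estimates = [(300.0, 400.0), (1000.0, 2000.0), (1600.0, 500.0)]
gps_positions = [(0.0, 0.0), (1000.0, 2000.0), (1500.0, 500.0)]

summary = error_summary(estimates, gps_positions)
print(summary)
```

Because the reference positions themselves come from a GPS unit, any GPS error is folded into these figures, and height differences are ignored by the planar distance.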
Despite the number of potential sources of error, the current investigation has pioneered 
a new, fast, and flexible paradigm for positioning using neural networks, avoiding complex 
statistical manipulation of the signal variations. 

The results reported in this paper were based on 17 training and 14 testing sites in a 3 km x 
4 km area. Furthermore, the project team knew of only 10 antennas on 4 base stations at the 
time of the experiments. Actual signal strength measurements were limited to the capabilities 
of a standard GSM mobile station. Current input data had no preprocessing. As our primary 
goal was positioning, all data samples were collected in a slowly moving environment. Thus, 
the models cannot be verified for faster moving mobile stations. 

A genetic algorithm search to find the best candidate ANNs usually runs 1-2 hours on a 
Pentium II-300 PC when restricting the type of ANN to MLP BP and the number of hidden 
neurons to 64. However, once its architecture is determined, the training of the ANN takes 
no longer than two minutes. Estimation of the position for a new measurement record is 
practically instantaneous. 

Future research directions have been determined and are currently being studied. They 
include a much better data acquisition system and a more accurate and automated system for 
the input of actual positions and measurements, which is necessary to both train and validate 
models. Also, other neural network models and training mechanisms will be further investigated, 
especially those that provide better tuning of the networks and shorter training times. 
Another area in which the models described in this paper can be used is the modeling of specific 
areas or streets where handover problems are constantly encountered. If the cellular network 
detects that the mobile station is constantly handed over between two or more base stations, it 
can use the neural network models to determine the position of the mobile and so reduce the load on 
the network. 

8. Conclusion 

This paper presents the results related to the positioning of the GSM mobile station using only 
the information present in the mobile station. As such, this positioning system can be used for 
both the positioning of the vehicles or mobile stations and the communication with the other 
parties in the system. Positioning of the mobile station was achieved by using signal strengths 
measured at the mobile station and then processed by artificial neural networks either in the 
mobile station itself or at the remote network center. Two major approaches for positioning 
were presented: using a neural network as a classifier that attempts to determine in which area 
the mobile station lies and using a neural network as a function approximator to determine 
the position of the mobile station. Two further refinements of function approximation were 
analyzed and compared: (1) direct positioning, which estimates the position in terms of 
two-dimensional coordinates, and (2) indirect positioning, which calculates the estimated position from 
the estimated distances of the mobile station from the base stations (or antennas) and/or angles 
to the base stations. Our current experiments show that the model which directly estimates the 
position of the mobile station using a minimal number of inputs relevant for positioning and 
the correctional neural network performs the best. Some of the directions for future research 
were also outlined in this paper. Results obtained so far are encouraging and serve as a good 
basis for future research. 